Siobhan's Programming Journal #3

Of course, the more I work with R, the less daunting it becomes. I had seen R before this course and had made up my mind that I would not like it, but I no longer have an aversion to it, and I’m finding I’m able to do more and more interesting things. I like that I can visualize collected data in so many different ways (which I’ve read about but have not yet explored for myself, though I will eventually). For this exercise, I wanted to explore using the plotly package in R. I only needed some simple data for a quick plot, so I created my own dataset in Excel capturing the amount of money spent on MetroCards for a year. Cleaning it all up confused me a bit, so I decided to create a simpler dataset within R itself. After reviewing the provided resource on creating line charts with plotly in R, I loosely followed one of the examples and created my own objects filled with the same data I had started to enter in Excel. After that, I combined those objects into one data frame to prepare for plotting.
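A minimal sketch of that step, using the four example months and fares quoted in the comment below this post. The estimatedFare values are hypothetical placeholders, since the real estimates are not listed anywhere here.

```r
# Separate vectors ("objects") holding each column of the dataset.
month <- c("Jan", "Feb", "Mar", "Apr")
fare <- c(3.75, 19.50, 67.34, 55)      # actual spending (example values)
estimatedFare <- c(30, 30, 30, 30)     # hypothetical estimates, for illustration only

# Combine the vectors into one data frame, one row per month.
TravelData <- data.frame(month, fare, estimatedFare, stringsAsFactors = FALSE)

# Inspect the structure before plotting.
str(TravelData)
```

Keeping month as plain text here (stringsAsFactors = FALSE) makes it easier to decide the ordering explicitly later.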


I used plotly to take the data frame and display it in the graph view, but I noticed something was not quite right. I expected my plot to show a continuous flow and trend, but it seemed to be all over the place. I knew I was making good progress, so this just needed a bit of digging and poking around. I read through more of the provided document on using plotly and did a few Google searches to learn more about plotly and what I could do with it in future projects.


In my reading and research, I found that the plot came out so weird because the graph defaults to alphabetical order. Since I plotted months along the x-axis, it listed the labels in month order but plotted the data in alphabetical order, which caused the confused-looking graph. I had to add a line of code, layout(title = "Budgeted Average Monthly Spending on MTA Fare NYC", xaxis = list(title = "Months")), to get the data to plot in the correct monthly order. This made my plot a whole lot cleaner!
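One common way to pin the month order is sketched below. It is not necessarily the exact code used for this project: it stores the months as a factor with explicit levels and also passes the same order to the x-axis through categoryorder and categoryarray. The estimatedFare values are hypothetical, and the plotting part only runs if the plotly package is installed.

```r
months <- c("Jan", "Feb", "Mar", "Apr")

TravelData <- data.frame(
  month = months,
  fare = c(3.75, 19.50, 67.34, 55),    # example values from this project
  estimatedFare = c(30, 30, 30, 30),   # hypothetical, for illustration only
  stringsAsFactors = FALSE
)

# Sorted as plain text, the months would run Apr, Feb, Jan, Mar.
# A factor with explicit levels keeps them in calendar order instead.
TravelData$month <- factor(TravelData$month, levels = months)

if (requireNamespace("plotly", quietly = TRUE)) {
  library(plotly)

  # Actual spending as the base trace.
  p <- plot_ly(TravelData, x = ~month, y = ~fare, name = "Actual",
               type = "scatter", mode = "lines+markers")

  # Estimated spending as a second trace on the same axes.
  p <- add_trace(p, y = ~estimatedFare, name = "Estimated", mode = "lines")

  # Titles, plus an explicit category order for the x-axis.
  p <- layout(p,
              title = "Budgeted Average Monthly Spending on MTA Fare NYC",
              xaxis = list(title = "Months",
                           categoryorder = "array",
                           categoryarray = months))
  p
}
```

Setting the order in both places is redundant, but it makes the intent obvious to anyone reading the code later.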


It was interesting to be able to plot both the actual data and the estimated data with plotly in R. I was also able to hover over points on the plot and see the data for each point on the graph itself. That is something I wanted to do in the first R exercise this semester, so it was a very useful problem to solve in this R project. Though I’m nowhere near the level of proficiency I’d like to have in R by now, I feel like I’m making progress and learning more with each exercise.

    • Siobhan Wilmot-Dunbar

      Hey Dana,

      Thanks so much! For this project, I created the data directly in R as a dataset, so I said month<-c('Jan', 'Feb', 'Mar', 'Apr') for example, and did another set for the fare, so fare<-c(3.75, 19.50, 67.34, 55), and then one more for estimated fare (that I used as the trace line). Then I created a data frame from those by saying TravelData<-data.frame(month, fare, estimatedFare). This gave me the data frame that I was able to manipulate and use with plotly.

      I think if you already have it saved as a csv on your machine, though, you can import it into R like before with other projects and set your x & y elements. This is what I did for this week's project, so my code might look like:

      plotName<-plot_ly(DataFrame, x=~xAxisData, y=~yAxisData, type='typeOfPlot')

      That's kind of how I was thinking through it so far. I hope this helps!
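A minimal sketch of the csv route described above. To stay self-contained it first writes a small temporary file. The file contents and the column names month and fare are assumptions for illustration, so swap in the path and columns of a real csv. The plot itself only runs if the plotly package is installed.

```r
# Write a tiny example csv to a temporary path (stand-in for a saved file).
csv_path <- tempfile(fileext = ".csv")
writeLines(c("month,fare",
             "Jan,3.75",
             "Feb,19.50",
             "Mar,67.34",
             "Apr,55"),
           csv_path)

# Import the csv into a data frame.
TravelData <- read.csv(csv_path, stringsAsFactors = FALSE)

# Keep the months in file order rather than alphabetical order.
TravelData$month <- factor(TravelData$month, levels = unique(TravelData$month))

if (requireNamespace("plotly", quietly = TRUE)) {
  library(plotly)
  # The ~ tells plot_ly to look the column up inside the data frame.
  plotName <- plot_ly(TravelData, x = ~month, y = ~fare,
                      type = "scatter", mode = "lines+markers")
  plotName
}
```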

    Quantitative Literature Analysis Spring 2018

    Here is the online home for our Quant Lit Analysis Class for Spring 2018.