RStudio

This is a data science project, and we are using RStudio.

variable | class | description
track_id | character | Song unique ID
track_name | character | Song name
track_artist | character | Song artist
track_popularity | double | Song popularity (0-100), where higher is better
track_album_id | character | Album unique ID
track_album_name | character | Song album name
track_album_release_date | character | Date when album was released
playlist_name | character | Name of playlist
playlist_id | character | Playlist ID
playlist_genre | character | Playlist genre
playlist_subgenre | character | Playlist subgenre
danceability | double | Danceability describes how suitable a track is for dancing based on a combination of musical elements including tempo, rhythm stability, beat strength, and overall regularity. A value of 0.0 is least danceable and 1.0 is most danceable.
energy | double | Energy is a measure from 0.0 to 1.0 and represents a perceptual measure of intensity and activity. Typically, energetic tracks feel fast, loud, and noisy. For example, death metal has high energy, while a Bach prelude scores low on the scale. Perceptual features contributing to this attribute include dynamic range, perceived loudness, timbre, onset rate, and general entropy.
key | double | The estimated overall key of the track. Integers map to pitches using standard Pitch Class notation. E.g. 0 = C, 1 = C♯/D♭, 2 = D, and so on. If no key was detected, the value is -1.
loudness | double | The overall loudness of a track in decibels (dB). Loudness values are averaged across the entire track and are useful for comparing relative loudness of tracks. Loudness is the quality of a sound that is the primary psychological correlate of physical strength (amplitude). Values typically range between -60 and 0 dB.
mode | double | Mode indicates the modality (major or minor) of a track, the type of scale from which its melodic content is derived. Major is represented by 1 and minor is 0.
speechiness | double | Speechiness detects the presence of spoken words in a track. The more exclusively speech-like the recording (e.g. talk show, audio book, poetry), the closer to 1.0 the attribute value. Values above 0.66 describe tracks that are probably made entirely of spoken words. Values between 0.33 and 0.66 describe tracks that may contain both music and speech, either in sections or layered, including such cases as rap music. Values below 0.33 most likely represent music and other non-speech-like tracks.
acousticness | double | A confidence measure from 0.0 to 1.0 of whether the track is acoustic. 1.0 represents high confidence the track is acoustic.
instrumentalness | double | Predicts whether a track contains no vocals. “Ooh” and “aah” sounds are treated as instrumental in this context. Rap or spoken word tracks are clearly “vocal”. The closer the instrumentalness value is to 1.0, the greater likelihood the track contains no vocal content. Values above 0.5 are intended to represent instrumental tracks, but confidence is higher as the value approaches 1.0.
liveness | double | Detects the presence of an audience in the recording. Higher liveness values represent an increased probability that the track was performed live. A value above 0.8 provides strong likelihood that the track is live.
valence | double | A measure from 0.0 to 1.0 describing the musical positiveness conveyed by a track. Tracks with high valence sound more positive (e.g. happy, cheerful, euphoric), while tracks with low valence sound more negative (e.g. sad, depressed, angry).
tempo | double | The overall estimated tempo of a track in beats per minute (BPM). In musical terminology, tempo is the speed or pace of a given piece and derives directly from the average beat duration.
duration_ms | double | Duration of song in milliseconds

In-class Coding Exercise 2

Open the Final Projects PDF file on the Canvas website. Peruse the different data sources for
your upcoming midterm/final project and perform some initial exploration. After today, you
should decide which dataset you’d like to use.

Getting Started
Once you’ve selected your preferred dataset (or maybe you can use this exploration to
decide), practice what you’ve learned by:

1. Importing the data
2. Identifying and reviewing the codebook (if available) or website of origin
3. Learning about the data by:
• Assessing dimensions
• Viewing the head and tail of the data
• Identifying the data types of each variable
• Identifying missing data
• Computing summary statistics for the variables
• Checking for duplicate rows or columns (You may need to google this! We haven’t
discussed duplicate rows/columns yet.)
4. Learning about the data visually by plotting:
• Histograms
• Bar charts
• Box plots
• Scatter plots
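
Since duplicate rows and columns have not been covered in class yet, here is a minimal sketch of one way to check for them in base R. The data frame `songs` and its values are made up for illustration (the `bpm` column is a hypothetical duplicate of `tempo`); substitute your own imported data.

```r
# Toy data frame standing in for the project data (hypothetical values).
songs <- data.frame(
  track_id   = c("a1", "b2", "b2", "c3"),
  track_name = c("Song A", "Song B", "Song B", "Song C"),
  tempo      = c(120, 98, 98, 140),
  bpm        = c(120, 98, 98, 140)  # deliberately identical to tempo
)

# Duplicate rows: duplicated() flags each row that repeats an earlier row.
dup_rows   <- duplicated(songs)
n_dup_rows <- sum(dup_rows)
print(songs[dup_rows, ])            # inspect the repeated rows

# Duplicate columns: treat the data frame as a list of columns and
# flag any column whose values repeat an earlier column exactly.
dup_cols      <- duplicated(as.list(songs))
dup_col_names <- names(songs)[dup_cols]
print(dup_col_names)

# Drop repeated rows and repeated columns in one step.
songs_clean <- unique(songs)[, !dup_cols, drop = FALSE]
print(dim(songs_clean))
```

Note that this only catches exact duplicates. Rows that differ by whitespace or capitalization, or columns stored in different units, would need their own checks.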

Where To Go From Here
By the end of class, identify which dataset you will use for your midterm/final project. Start
thinking about what types of questions you want to ask and answer with these datasets
in your final project. Come to class next week prepared with the data you plan to use and
the research questions that will guide your data analysis. As you develop questions,
think about storyboarding in order to build a cohesive story that uses data to create
insights that lead to action.

In-class Coding Exercises
Start exploring your final project data! Within your group, discuss your final project
data and the 10 specific questions you wanted to ask of your data. Discuss what kind
of transformations would best answer these questions.

1. Start transforming your data to gain new insights. Based on what you learned
this week, some questions you may want to ask are:

• What features could you filter on?
• How could arranging your data in different ways help?
• Can you reduce your data by selecting only certain variables?
• Could creating new variables add new insights?
• Could summary statistics at different categorical levels tell you more?
• How can you incorporate the pipe (%>%) operator to make your code more efficient?

2. Does your final project leverage more than one data set?
• If so, start joining your data sets with the skills you learned.
• If not, try to identify another data set that you can join to your final project data to
make it even more interesting.

3. Your mid-term project eval is due by the end of today’s class. You need to
render this mid-term report as an R Markdown PDF file and you need to show
all your code. If you want to incorporate the above tasks into your midterm,
you can certainly do so at this time.
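
The transformation and join questions in items 1 and 2 could be explored along the following lines. This is only a sketch under stated assumptions: it uses the dplyr package, the data frames `songs` and `artists` are toy data invented for illustration (only the column names in `songs` come from the codebook), and the popularity cutoff of 50 is arbitrary.

```r
library(dplyr)

# Toy stand-in for the project data (hypothetical values).
songs <- data.frame(
  track_id         = c("t1", "t2", "t3", "t4"),
  track_artist     = c("A", "B", "A", "C"),
  playlist_genre   = c("pop", "rock", "pop", "rap"),
  track_popularity = c(80, 45, 60, 90),
  duration_ms      = c(210000, 180000, 240000, 200000)
)

# Hypothetical second data set to join on the artist name.
artists <- data.frame(
  track_artist = c("A", "B"),
  country      = c("US", "UK")
)

genre_summary <- songs %>%
  filter(track_popularity >= 50) %>%             # filter on a feature
  select(track_artist, playlist_genre,
         track_popularity, duration_ms) %>%      # keep only needed variables
  mutate(duration_min = duration_ms / 60000) %>% # create a new variable
  group_by(playlist_genre) %>%                   # summarise per category
  summarise(
    n_tracks          = n(),
    mean_popularity   = mean(track_popularity),
    mean_duration_min = mean(duration_min)
  ) %>%
  arrange(desc(mean_popularity))                 # arrange the result

print(genre_summary)

# left_join() keeps every song; artists without a match get NA.
songs_joined <- songs %>%
  left_join(artists, by = "track_artist")

print(songs_joined)
```

A left join is used so that no songs are silently dropped. Counting the `NA` values in the joined columns afterwards shows how many rows had no match in the second data set.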
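
# Confirm the working directory so the relative path below resolves
getwd()
# Import the data (straight quotes are required; curly quotes cause a syntax error)
data <- read.csv("Global Music/GlobalMusicData.csv")
head(data, 5)           # first five rows
tail(data, 5)           # last five rows
str(data)               # dimensions and data type of each variable
colSums(is.na(data))    # number of missing values in each column
summary(data)           # summary statistics for each variable
hist(data$tempo)        # histogram of tempo (BPM)
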
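
The `hist()` call covers the first plot type from the exercise list. The other three (bar chart, box plot, scatter plot) could be sketched in base R as below. The data frame `songs` is randomly generated toy data that borrows column names from the codebook, so the plots only illustrate the calls, not real patterns.

```r
# Toy stand-in for the project data (random, hypothetical values).
set.seed(1)
songs <- data.frame(
  playlist_genre   = sample(c("pop", "rock", "rap"), 60, replace = TRUE),
  danceability     = runif(60),
  energy           = runif(60),
  track_popularity = round(runif(60, 0, 100))
)

# Bar chart: count of tracks in each category.
genre_counts <- table(songs$playlist_genre)
barplot(genre_counts, main = "Tracks per genre",
        xlab = "Genre", ylab = "Count")

# Box plot: distribution of a numeric variable within each category.
boxplot(track_popularity ~ playlist_genre, data = songs,
        main = "Popularity by genre",
        xlab = "Genre", ylab = "Popularity")

# Scatter plot: relationship between two numeric variables.
plot(songs$danceability, songs$energy,
     main = "Energy vs. danceability",
     xlab = "Danceability", ylab = "Energy")
```

Bar charts suit categorical variables such as `playlist_genre`, while histograms, box plots, and scatter plots suit numeric variables such as `tempo` or `energy`.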
Mid-term Expectations
You may work with one or two other people on the midterm and final projects if you
wish. Once you decide on working solo or as a group, your decision remains for the
rest of the course (i.e., you can’t decide to work alone or join someone after submitting
the midterm).
Please read the final project page before reading any further.
Throughout the term you will progressively create your final project. Your mid-term
project is to submit the work you have completed midway through the course for a
progress evaluation, where you have fully completed standards 1.1-4.4 and 7.1-7.4 as
shown below. This progress check will allow your peers and me to provide you
direction for final completion. This mid-term report will be rendered as an R Markdown
HTML or PDF product.
Mid-term expectations, which are based on the final project standards, are listed below:

Introduction (5 possible points)
1.1 Provide an introduction that explains the problem statement you are addressing. Why should I be interested in this?
1.2 Provide a short explanation of how you plan to address this problem statement (the data used and the methodology employed)
1.3 Discuss your current proposed approach/analytic technique you think will address (fully or partially) this problem.
1.4 Explain how your analysis will help the consumer of your analysis.

Packages Required (5 possible points)
2.1 All packages used are loaded upfront so the reader knows which are required to replicate the analysis.
2.2 Messages and warnings resulting from loading the package are suppressed.
2.3 Explanation is provided regarding the purpose of each package (there are over 10,000 packages, don’t assume that I know why you loaded each package).

Data Preparation (10 possible points)
3.1 Original source where the data was obtained is cited and, if possible, hyperlinked.
3.2 Source data is thoroughly explained (i.e. what was the original purpose of the data, when was it collected, how many variables did the original have, explain any peculiarities of the source data such as how missing values are recorded, or how data was imputed, etc.).
3.3 Data importing and cleaning steps are explained in the text (tell me why you are doing the data cleaning activities that you perform) and follow a logical process.
3.4 Once your data is clean, show what the final data set looks like. However, do not print off a data frame with 200+ rows; show me the data in the most condensed form possible.
3.5 Provide summary information about the variables of concern in your cleaned data set. Do not just print off a bunch of code chunks with str(), summary(), etc. Rather, provide me with a consolidated explanation, either with a table that provides summary info for each variable or a nicely written summary paragraph with inline code.

Proposed Exploratory Data Analysis (5 possible points)
4.1 Discuss how you plan to uncover new information in the data that is not self-evident. What are different ways you could look at this data to answer the questions you want to answer? Do you plan to slice and dice the data in different ways, create new variables, or join separate data frames to create new summary information? How could you summarize your data to answer key questions?
4.2 What types of plots and tables will help you to illustrate the findings to your questions?
4.3 What do you not know how to do right now that you need to learn to answer your questions?
4.4 Do you plan on incorporating any machine learning techniques (i.e. linear regression, discriminant analysis, cluster analysis) to answer your questions?

Formatting & Other Requirements (15 possible points)
7.1 All code is visible, proper coding style is followed, and code is well commented (see section regarding style).
7.2 Coding is systematic – complicated problem broken down into sub-problems that are individually much simpler. Code is efficient, correct, and minimal. Code uses appropriate data structure (list, data frame, vector/matrix/array). Code checks for common errors.
7.3 Achievement, mastery, cleverness, creativity: Tools and techniques from the course are applied very competently and, perhaps, somewhat creatively. Perhaps student has gone beyond what was expected and required, e.g., extraordinary effort, additional tools not addressed by this course, unusually sophisticated application of tools from course.
7.4 .Rmd fully executes without any errors and HTML produced matches the HTML report submitted by student.
http://uc-r.github.io/basics#style
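
As a rough illustration of standards 2.1, 2.2, and 3.5, the sketch below loads a package upfront with its startup messages suppressed and then builds one consolidated summary table instead of printing raw `str()` and `summary()` output. The choice of dplyr as the example package and the toy data frame `songs` are assumptions made for illustration only.

```r
# Standards 2.1-2.2: load every package at the top of the report and
# silence the startup messages. In an R Markdown chunk, the chunk
# options message=FALSE and warning=FALSE have a similar effect.
suppressPackageStartupMessages({
  library(dplyr)  # example package; standard 2.3 asks you to explain each one
})

# Toy stand-in for the cleaned project data (hypothetical values).
songs <- data.frame(
  tempo          = c(120, 98, NA, 140),
  energy         = c(0.8, 0.4, 0.6, 0.9),
  playlist_genre = c("pop", "rock", "pop", "rap")
)

# Standard 3.5: one row of summary information per numeric variable.
numeric_cols <- songs[sapply(songs, is.numeric)]
summary_table <- data.frame(
  variable  = names(numeric_cols),
  n_missing = sapply(numeric_cols, function(x) sum(is.na(x))),
  mean      = sapply(numeric_cols, mean, na.rm = TRUE),
  sd        = sapply(numeric_cols, sd,   na.rm = TRUE),
  min       = sapply(numeric_cols, min,  na.rm = TRUE),
  max       = sapply(numeric_cols, max,  na.rm = TRUE),
  row.names = NULL
)

print(summary_table)
```

The table covers numeric variables only. Categorical variables such as `playlist_genre` are usually better summarized with counts per level, for example via `table()`.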
