Social Networking Analysis Final Project:
The Marvel Universe Social Network
ABSTRACT
The Marvel Cinematic Universe (MCU) has become a dominant force in popular culture, and the comic universe behind it forms a large, dense social network of characters. In this project, we analyze the Marvel hero network, in which two heroes are connected if they appear in the same comic book, covering 6,426 heroes and 574,467 ties. We compute descriptive network statistics (diameter, average degree, clustering coefficient, path length, modularity, and density) and centrality measures to identify the most influential heroes, and we visualize the network with Gephi. We complement the network analysis with 10,000 tweets collected under the #marvel hashtag, which we cluster with LDA topic modeling and score with sentiment analysis. We conclude with recommendations on which hero interactions the studio could emphasize in upcoming MCU phases.

INTRODUCTION
In mid-2017, the Marvel Cinematic Universe (MCU) passed the 12-billion-dollar milestone in box office revenue, and the recently released trailer for Avengers: Endgame garnered 48.7 million views on YouTube within a day of its release. Marvel is a passion for many comic book fans and a weekend's entertainment for the general public. Our final project is motivated by our inner geek: we set out to explore the universe of Marvel heroes and quantify our pre-existing notions about it.
Our research question was motivated by our own interests: we wanted to analyze the role and influence of each hero in the network and how heroes drive strategic decision making for the studio. In the upcoming sections, we quantify the power of the network and examine the influence of individual heroes.
The Marvel Universe has a little over 6,400 heroes (nodes) in the comic world and close to 150 characters in the cinematic universe. This in turn creates a dense network of heroes and adds complexity to understanding the influence of each hero. To begin our analysis, we started with the assumption that two heroes are connected if they appear in the same comic book. Our first step was to examine the properties and characteristics of the Marvel network by calculating several network dimensions. After this initial analysis, we created network visualizations of the ties among famous Marvel heroes using Gephi, a network visualization and exploration tool. Our analysis also includes the collection of Twitter data and topic modeling.

Related Work
The process of collecting, cleaning, filtering, and planning the data was critical for the analysis of the network. The data was obtained from a source file on Kaggle containing the network between all the heroes in the Marvel universe (574,467 ties) and the comics they have appeared in. To look into the MCU specifically, we decided to mine data from Twitter and compare it with the pre-obtained data for our analysis.

Add 150 words here

Approach
DESCRIPTIVE STATISTICS
To understand the Marvel universe, we constructed one undirected network to explore the social network of the heroes: the Marvel comic network, which includes 6,426 unique comic heroes and 574,467 edges. Our assumption for the network was that the more times a hero appears in a comic, the more popular the hero is. By focusing on the Marvel comic network and adopting this methodology, we were able to obtain and analyze the results below.
The Marvel comic network is a unique network: it is constructed from the interactions of fictional characters, yet at the same time it is very large and complex. To understand these complexities, it is useful to provide some network dimensions and descriptive statistics. Nodes depict superheroes and edges depict co-occurrences of superheroes within a single Marvel Comics issue. The comic network has a network diameter (the largest distance between any two nodes in the network) of 7 steps. The average degree (how connected a node is) is 34.027. The clustering coefficient is 0.53, showing that two heroes who share a collaborator are much more likely to be connected themselves than a randomly chosen pair. The average path length (the characteristic path length computed over geodesics, or shortest paths) is 2.889; thus, any pair of heroes can be connected through an average of about 3 collaborations. The network has a relatively high modularity of 0.49, showing that the heroes form densely connected communities. The graph density (the fraction of edges present out of all possible edges) is 0.003, which indicates that the comic network is quite sparse and that a few heroes account for most of the edge connections. Most of the comic heroes are peripheral nodes with only a few appearances. Table 1 shows the network dimensions.

                      Graph density   Network diameter   Average path length   Modularity   Average degree
Hero social network       0.003              7                  2.889             0.49          34.027
Table 1 – Marvel hero network descriptive statistics
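These statistics were produced in Gephi; purely as an illustration, a roughly equivalent computation in Python with NetworkX might look like the sketch below (the edge-list file name and column layout are assumptions, not the actual Kaggle file).

```python
# Sketch: recomputing Table 1's statistics with NetworkX (illustrative only).
# Assumes an edge list "hero-network.csv" with two hero-name columns per row.
import csv
import networkx as nx
from networkx.algorithms import community

G = nx.Graph()
with open("hero-network.csv", newline="", encoding="utf-8") as f:
    reader = csv.reader(f)
    next(reader)                      # skip header row
    for hero1, hero2 in reader:
        G.add_edge(hero1, hero2)      # undirected co-appearance tie

density = nx.density(G)                               # 0.003 in Table 1
avg_degree = sum(d for _, d in G.degree()) / G.number_of_nodes()
clustering = nx.average_clustering(G)                 # 0.53 reported

# Diameter and average path length are defined on the largest connected component.
largest_cc = G.subgraph(max(nx.connected_components(G), key=len))
diameter = nx.diameter(largest_cc)                    # 7 reported
avg_path = nx.average_shortest_path_length(largest_cc)  # 2.889 reported

# Modularity of a greedy community partition (Gephi uses Louvain, so values may differ slightly).
communities = community.greedy_modularity_communities(G)
modularity = community.modularity(G, communities)     # ~0.49 reported

print(density, avg_degree, clustering, diameter, avg_path, modularity)
```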
Moving forward, we reviewed the individual influencers of the network. Our review involved a detailed analysis of specific nodes to garner some powerful insights. In terms of degree centrality (how well a node is connected in terms of direct connections), Captain America, Spider-Man, and Iron Man are the top three heroes. This is consistent with the fact that they are among the oldest and most successful heroes. Betweenness (how well situated a node is in terms of the paths it lies on), another centrality measure, can be defined as the number of times a node lies on the shortest path between two other nodes in the network. In terms of betweenness centrality, Wolverine plays the most important role in conveying collaborations (information) to heroes in the network who are not connected with each other. Wolverine controls the collaborations between different clusters since he is the main connection between the X-Men and the Avengers. Closeness centrality (how close a given node is to any other) represents the ability to spread information to the whole network within the shortest time frame. Mister Fantastic is the character who can reach all other heroes most quickly, since he is the leader of the Fantastic Four and at the center of the comic universe. The PageRank algorithm (first used by Google to score the relative authority and importance of web pages) was also applied to the hero network. Unsurprisingly, Spider-Man has the highest PageRank centrality since he has direct connections to other influential heroes such as Iron Man and Captain America. We also note that the heroes connected to Spider-Man have relatively high PageRank scores.
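As an illustrative sketch only (the paper computed these measures in Gephi), the same centrality rankings could be reproduced with NetworkX, reusing the graph G built in the previous sketch:

```python
# Sketch: centrality measures discussed above, computed with NetworkX (illustrative only).
import networkx as nx

def top5(scores):
    """Return the five highest-scoring heroes for a centrality dictionary."""
    return sorted(scores, key=scores.get, reverse=True)[:5]

degree = nx.degree_centrality(G)             # direct connections
betweenness = nx.betweenness_centrality(G)   # brokerage between clusters (slow on 6,426 nodes)
closeness = nx.closeness_centrality(G)       # speed of reaching the whole network
pagerank = nx.pagerank(G)                    # authority via connections to influential heroes

print("Degree:     ", top5(degree))          # expected: Captain America, Spider-Man, Iron Man, ...
print("Betweenness:", top5(betweenness))     # expected: Wolverine at the top
print("Closeness:  ", top5(closeness))       # expected: Mister Fantastic at the top
print("PageRank:   ", top5(pagerank))        # expected: Spider-Man at the top
```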
GEPHI PROCESS
In order to gain a further understanding of the comic network, we created network visualizations with the help of external software, Gephi. With the assumption that edge strength is based on whether two nodes appear in the same comic, we created the following figure.

Figure 1 – Marvel comic universe network

Figure 1 represents the Marvel comic universe network with ties of different strengths. White ties represent weak connections between pure comic heroes, blue ties represent medium connections between comic nodes and movie heroes, and red ties represent strong connections between movie heroes.
Twitter Data Collection
The data collection was done using the “Tweepy” library, which wraps the Twitter API behind the scenes and allowed us to scrape tweets based on given keywords and a starting posting time. For this, we had to apply for a developer account on Twitter in order to obtain an API key and access token, which are mandatory for legitimate data scraping. Due to limits on the number of requests allowed within a time frame, we used only five runs with a sleep time of 15 minutes between them, so that we did not abuse the scraping process or get blocked by the rate limiter. In the end, we were able to collect 10,000 tweets related to the keyword #marvel.
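A minimal sketch of the collection loop is shown below. It assumes Tweepy 3.x (where the search endpoint is api.search; in Tweepy 4.x it is api.search_tweets), and the credential placeholders and query parameters are illustrative assumptions rather than the exact values used.

```python
# Sketch: collecting #marvel tweets with Tweepy (illustrative; credentials are placeholders).
import tweepy

auth = tweepy.OAuthHandler("CONSUMER_KEY", "CONSUMER_SECRET")
auth.set_access_token("ACCESS_TOKEN", "ACCESS_TOKEN_SECRET")
# wait_on_rate_limit makes Tweepy sleep automatically when the rate limit is hit.
api = tweepy.API(auth, wait_on_rate_limit=True)

tweets = []
for status in tweepy.Cursor(api.search,
                            q="#marvel",
                            lang="en",
                            tweet_mode="extended").items(2000):
    tweets.append(status.full_text)

print(len(tweets), "tweets collected")
```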
Twitter Data Preprocessing
The data preprocessing step was about preparing our collected data to be fed to the models, either LDA for topic modeling or sentiment analysis. The process transforms the raw tweets into clean data through the following steps (a minimal sketch of the pipeline follows the list):
– lowercase text
– remove whitespace
– remove numbers
– remove special characters
– remove emails
– remove stop words
– remove additional stopwords
– remove weblinks and mentions
– lemmatization
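A minimal sketch of this cleaning pipeline, assuming NLTK for stop words and lemmatization (the extra stop-word list is a hypothetical example):

```python
# Sketch: tweet cleaning pipeline following the steps above (illustrative only).
import re
import nltk
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer

nltk.download("stopwords")
nltk.download("wordnet")

STOPWORDS = set(stopwords.words("english"))
EXTRA_STOPWORDS = {"marvel", "rt", "amp"}   # hypothetical domain-specific additions
lemmatizer = WordNetLemmatizer()

def clean_tweet(text):
    text = text.lower()                                    # lowercase text
    text = re.sub(r"http\S+|www\.\S+", " ", text)          # remove weblinks
    text = re.sub(r"@\w+", " ", text)                      # remove mentions
    text = re.sub(r"\S+@\S+", " ", text)                   # remove emails
    text = re.sub(r"\d+", " ", text)                       # remove numbers
    text = re.sub(r"[^a-z\s]", " ", text)                  # remove special characters
    tokens = text.split()                                  # also collapses whitespace
    tokens = [t for t in tokens if t not in STOPWORDS and t not in EXTRA_STOPWORDS]
    tokens = [lemmatizer.lemmatize(t) for t in tokens]     # lemmatization
    return tokens

cleaned = [clean_tweet(t) for t in tweets]   # "tweets" from the collection step above
```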
After removing all the noise surrounding our tweets, we then move to the most important step, which is topic modeling.
Topic Modeling
The topic modeling process aims to cluster our dataset of tweets into homogeneous topics (clusters) that share an underlying semantic, using Latent Dirichlet Allocation (LDA). The algorithm takes as parameters the text corpus collected from the tweets and the number of topics into which to cluster the data. There is no hard rule for determining the number of topics a corpus should be split into, so the idea is to hypothesize a range of k values, for example from 2 to 17, and measure the coherence of the topics produced for each k. The result is a mapping in which the keys are the candidate values of k and the values are the coherence scores (we used the “u_mass” metric); we then pick the k with the highest coherence as the best number of topics for the corpus. In our case, the best k was around 3 or 4, depending on the execution run.
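The report does not name the LDA implementation; a minimal sketch of this model selection loop, assuming Gensim, is shown below:

```python
# Sketch: choosing the number of LDA topics by u_mass coherence (assumes Gensim).
from gensim import corpora
from gensim.models import LdaModel, CoherenceModel

dictionary = corpora.Dictionary(cleaned)                 # "cleaned" token lists from preprocessing
corpus = [dictionary.doc2bow(doc) for doc in cleaned]

coherence_by_k = {}
for k in range(2, 18):                                   # hypothesized range of topic counts
    lda = LdaModel(corpus=corpus, id2word=dictionary, num_topics=k,
                   random_state=42, passes=5)
    cm = CoherenceModel(model=lda, corpus=corpus,
                        dictionary=dictionary, coherence="u_mass")
    coherence_by_k[k] = cm.get_coherence()

best_k = max(coherence_by_k, key=coherence_by_k.get)     # ~3-4 in our runs
best_lda = LdaModel(corpus=corpus, id2word=dictionary,
                    num_topics=best_k, random_state=42, passes=5)
print(best_k, best_lda.print_topics())
```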
Twitter Data Analysis
1. Increasing dataset effect:
Increasing the number of tweets would make our corpus larger than the current one; hence, the number of topics covered in the corpus would increase and become more diverse. From a sentiment analysis perspective, this would also make the distribution of sentiments over the text more diverse and less balanced, since the data collection does not apply any rule-based filtering.
2. Sentiment Analysis
Figure 2 – Sentiment Scores
As we can see in Figure 2, the sentiment of Marvel-related tweets is highly positive, with over 8,000 positive tweets, while the negative and neutral classes contain around 300 tweets each.
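The report does not specify which sentiment classifier was used; purely as an illustration, per-tweet polarity labels such as those in Figure 2 could be produced with NLTK's VADER analyzer:

```python
# Sketch: labeling tweets as positive/neutral/negative with VADER (assumed tool, for illustration).
import nltk
from nltk.sentiment.vader import SentimentIntensityAnalyzer

nltk.download("vader_lexicon")
sia = SentimentIntensityAnalyzer()

def label(text):
    score = sia.polarity_scores(text)["compound"]   # compound score in [-1, 1]
    if score >= 0.05:
        return "positive"
    if score <= -0.05:
        return "negative"
    return "neutral"

labels = [label(t) for t in tweets]                  # raw tweet text, not the cleaned tokens
print({s: labels.count(s) for s in ("positive", "neutral", "negative")})
```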
Figure 3 – Topic Sentiment
As we can see, positive tweets strongly dominate the topic_1, topic_2, and topic_3 clusters. This is a good sign that could be interpreted as evidence of the quality of the LDA clustering, since it grouped positive tweets together without knowing their sentiment beforehand.
3. Top Hashtags
Figure 4 – Topic Hashtags
The plots above show that the most common hashtags are quite similar, especially between topic_2 and topic_3 among the top 3 unigrams, while topic_1 remains a bit further from those two contexts, as shown in the “pyLDAvis” figure before (refer to the notebook). Hence, we can conclude that there is a correlation between the most used hashtags and the likelihood that a given document belongs to a topic.
4. Stop-word removal effect:
Stop-word removal is an essential step in data preprocessing for NLP text analysis. The main goal is to remove noisy words that do not help the machine learning models in their predictions, and to keep only pertinent and meaningful words. In addition, the count of stop words is sometimes very large, which could strongly bias the model toward a specific class or topic.
In the case where stop words are the most used unigrams on Twitter, our corpus becomes much more refined in terms of unigrams, and the extracted topics become more meaningful since they no longer rely on the most common words. On the other hand, we risk losing the context of the topic, so stop-word removal is a good approach but should be applied with a certain threshold.
5. Punctuation removal effect:
An important NLP preprocessing step is the removal of punctuation marks. These marks, used to divide text into sentences, paragraphs, and phrases, affect the results of any text processing approach, especially those that depend on the occurrence frequencies of words and phrases, since punctuation marks appear frequently in text. Keeping them could bias the results, because this noisy data would remain in our corpus without contributing much to either the clustering process for topic modeling or the sentiment analysis predictions.
6. Word Stemming effect:
Both stemming and lemmatization improve results by removing semantic duplicates. This allows us to return more words related to a topic so that the user can better understand it. However, stemming adds noise to the results because it produces stems that are not real words, which is why we used only lemmatization in this topic modeling process, so that the extracted topics would be easier to interpret.
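A small illustration of the difference, assuming NLTK's PorterStemmer and WordNetLemmatizer (the example words are hypothetical):

```python
# Sketch: why stemming can add noise compared to lemmatization (illustrative only).
from nltk.stem import PorterStemmer, WordNetLemmatizer

stemmer = PorterStemmer()
lemmatizer = WordNetLemmatizer()

for word in ["heroes", "universe", "studies", "amazing"]:
    print(word, "->", stemmer.stem(word), "|", lemmatizer.lemmatize(word))
# e.g. "universe" stems to "univers" (not a real word), while lemmatization keeps "universe".
```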
7. Future implementation
As future improvements to this process, we could augment the dataset and try a different approach to topic modeling with other algorithms such as Latent Semantic Indexing (LSI), which is very good at preserving the semantics shared between clustered topics. For sentiment analysis, we could also use transformer models, which perform very well on these kinds of tasks since they are trained on very large corpora.
Conclusion
Our topic modeling results provide valuable information on heroes and their interactions. We note that some heroes are very close in the comics but not in the movies. Marvel could focus on such interactions and do something remarkable to capture audience attention. As Phase Four of the MCU rolls out beginning in 2019, there is scope to bring some of these interactions to the audience. For example, Falcon and Captain America are considered a strong duo in the comics, but in the MCU their relationship has not been as strong.
Based on our Marvel network analysis results, we recommend that the stakeholders and producers focus on the interactions across the different movie series of the MCU and bring more new, central characters from the comic universe into the movie world. Implementing different strategies for central versus peripheral heroes in the MCU is vital. Building on the network analysis results, breaking the network down into villains and heroes would also help reveal potentially interesting effects.
Add 150 words here
Team 9 December 10, 2018

