COMP1702

Big Data

Faculty Header ID

Contribution: 100% of course

Course Leader:

Coursework

Deadline Date:
25 April 2022 (23:30)

Feedback and grades are normally made available within 15 working days of the coursework deadline

Learning Outcomes:
1 Explain the concept of Big Data and its importance in a modern economy
2 Explain the core architecture and algorithms underpinning big data processing
3 Analyse and visualize large data sets using a range of statistical and big data technologies
4 Critically evaluate, select and employ appropriate tools and technologies for the development of big data applications
Plagiarism is presenting somebody else’s work as your own. It includes copying information directly from the Web or books without referencing the material; submitting joint coursework as an individual effort; copying another student’s coursework; stealing coursework from another student and submitting it as your own work. Suspected plagiarism will be investigated and if found to have occurred will be dealt with according to the procedures set down by the University. Please see your student handbook for further details of what is / isn’t plagiarism.
All material copied or amended from any source (e.g. internet, books) must be referenced correctly according to the reference style you are using.
Your work will be submitted for plagiarism checking. Any attempt to bypass our plagiarism detection systems will be treated as a severe Assessment Offence.
Coursework Submission Requirements
· An electronic copy of your work for this coursework must be fully uploaded on the Deadline Date of 25th April 2022 using the link on the coursework Moodle page for COMP1702.
· For this coursework you must submit a single report in PDF format. In general, any text in the document must not be an image (i.e. must not be scanned) and would normally be generated from other documents (e.g. MS Office using “Save As .. PDF”). An exception to this is handwritten mathematical notation, but when scanning do ensure the file size is not excessive.
· There are limits on the file size (see the relevant course Moodle page).
· Make sure that any files you upload are virus-free, not protected by a password, and not corrupted; otherwise they will be treated as null submissions.
· Your work will not be printed in colour. Please ensure that any pages with colour are acceptable when printed in Black and White.
· You must NOT submit a paper copy of this coursework.
· All coursework must be submitted as above. Under no circumstances can it be accepted by academic staff.
The University website has details of the current Coursework Regulations, including details of penalties for late submission, procedures for Extenuating Circumstances, and penalties for Assessment Offences. See http://www2.gre.ac.uk/current-students/regs
Detailed Specification
You are expected to work individually and complete a report that addresses the following tasks. You need to cite all sources you rely on using in-text citations. You may include material discussed in the lectures or labs, but additional credit will be given for independent research. Note: References should be in Harvard format. The word count does NOT include references.
· Part A (25 Marks)
· Task A.1 [mark 10] Explain the main characteristics of Big Data. (Word count: 200 words ±10%)
· Task A.2 [mark 15] Compare Hadoop and Relational Database Systems. Give an application scenario that is well suited to Hadoop and explain your reason. (Word count: 300 words ±10%)
· Part B (30 Marks): MapReduce Programming
Suppose that you have a large student file which cannot be stored on a single machine. Each record of this file contains the following information: (Student_ID, Student_Name, Sex, Age, Module, Grade, Department).
· Task B.1 [mark 15] Please design a MapReduce algorithm (pseudocode or Java code) to output the average grade for each module. The algorithm is expected to be as efficient as possible.
· Task B.2 [mark 15] Describe the algorithm designed. You should explain how the input is mapped into (key, value) pairs by the map stage, i.e., specify what is the key and what is the associated value in each pair, and, if needed, how the key(s) and value(s) are computed. Then you should explain how the output (key, value) pairs of the map stage are processed by the reduce stage to get the final answer(s). You should also analyse the efficiency of the MapReduce algorithm designed. (Word count: 300 words ±10%)
· Part C (45 marks): Big Data Project Analysis
The CropY company is a leading provider of precision agriculture services. Precision agriculture is the science of gathering, processing, and analysing temporal, spatial and individual data. It combines this with other information to support management decisions according to estimated variability, for improved resource use efficiency, productivity, quality, and profitability.
The CropY company now plans to develop a big data project that helps worldwide users to: better understand the implications of the weather and make contingency plans; buy supplies, such as fertilizer and seeds; maintain and monitor the quality of yield, whether livestock or crops; know the variety of cultivated plants, the conditions of their growth and their seed needs; choose the type of fertilizer and pesticides, and understand their conditions of use and their impact on the climate-soil-plant system; recognize the daily water needs of each kind of plant; calculate the median and mean values of yield; study the conditions of the natural environment; and estimate financial revenue and manage potential risks.
· Task C.1 [mark 10]: The volume of big data is expected to be more than 500 Petabytes. The data will come from various sensors, satellites, drones, social media, market data, online news feeds, etc. Figure 1 below shows some example data of the CropY company. Some IT technicians plan to build a data warehouse to store the data for further data analysis tasks, but others believe a data lake is a better choice. Which choice do you prefer? Please justify your choice. (Word count: 300 words ±10%)
Figure 1. Example Data of CropY Company
· Task C.2 [mark 10]: The data of the CropY company includes a large collection of plants, crops, diseases, symptoms, pests, and the relationships between them. The CropY company needs to build an analytical data store which can facilitate queries like: “find all diseases which are directly or indirectly caused by nitrogen deficiency”. Please recommend a data store and justify your choice. (Word count: 300 words ±10%)
· Task C.3 [mark 15]: Some prediction and analytics services provided by the CropY company are required to respond within a few seconds of the arrival of new data. Namely, they are real-time or near-real-time prediction and analytics tasks. Some IT managers suggested using a popular distributed processing framework, MapReduce, to implement these tasks. Do you agree with that? Please justify your choice. (Word count: 300 words ±10%)
· Task C.4 [mark 10]: The CropY company decided to move most of its applications and services to the cloud. These applications and services need to be highly available, scalable, and accessible worldwide. Note that some data, such as price and customer data, are confidential. Please design a cloud hosting strategy for this big data project and explain how your design will meet the security, scalability, and high availability requirements. (Word count: 300 words ±10%)
Grading Criteria
Grade 80-100% Exceptional
Clear evidence of research
Excellent quality and innovation with total control of all relevant material. Demonstrate outstanding insight and an ability to structure and synthesise material.
Demonstrates an excellent understanding of the material and issues
Relevant use of referencing and examples. The reference is complete and precise. Expression/style/grammar outstanding.
Grade 70-79% Excellent
Clear evidence of research
Able to criticise and evaluate material.
Demonstrate good insight and an ability to structure and synthesise material
Demonstrates a good understanding of the material and issues
Professional standard of report
The reference is nearly complete and precise.
Grade 60-69% Very Good
Evidence of adequate research
Meets the essential functional requirements
The design uses the appropriate frameworks but may have errors.
Acceptable standard of report
The references are basically satisfactory.
Grade 50-59% Good
A partial response to the question
Little sustained attempt to develop a coherent answer
Limited reading
The evidence may be misremembered, vague or insufficient to constitute a serious response
Containing errors of fact or interpretation
The references are NOT enough.
Grade <50% Fail
Few requirements met
Poor standard of report
Does not demonstrate self-direction or originality in problem solving or a critical self-evaluation of the project process
No (or wrong) References
