CN7030 Machine Learning on Big Data, University of East, UK

Machine Learning on Big Data

Big Data Analytics using ML and Streaming methods

Big Data Analytics using PySpark

Develop one multi-class classifier and one clustering model.
Explain the features and configurations you choose to apply.

Evaluate and visualize the accuracy/performance and the working solution for each method you applied.
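
As a rough illustration of the classifier, clustering, and evaluation steps above, the following sketch uses the PySpark ML API. The choice of random forest and k-means, the column names, and every hyperparameter are assumptions made for the example; the brief does not prescribe them. The Spark imports sit inside the function only so that the file can be loaded on a machine without Spark.

```python
def train_and_evaluate(df, feature_cols, label_col="label", k=3, seed=42):
    """Fit one multi-class classifier and one clustering model on a Spark DataFrame.

    Assumes `df` holds numeric feature columns and a categorical label column.
    Returns a dict with test accuracy, test F1 and the k-means silhouette score.
    """
    from pyspark.ml import Pipeline
    from pyspark.ml.classification import RandomForestClassifier
    from pyspark.ml.clustering import KMeans
    from pyspark.ml.evaluation import (
        ClusteringEvaluator,
        MulticlassClassificationEvaluator,
    )
    from pyspark.ml.feature import StringIndexer, VectorAssembler

    # Pack the numeric columns into the single vector column Spark ML expects.
    assembler = VectorAssembler(inputCols=feature_cols, outputCol="features")
    # Map label values to indices; rows with labels unseen in training are skipped.
    indexer = StringIndexer(
        inputCol=label_col, outputCol="label_idx", handleInvalid="skip"
    )

    train, test = df.randomSplit([0.8, 0.2], seed=seed)

    # Multi-class classifier (random forest chosen only as an example).
    clf = RandomForestClassifier(
        featuresCol="features", labelCol="label_idx", numTrees=50, seed=seed
    )
    clf_model = Pipeline(stages=[assembler, indexer, clf]).fit(train)
    predictions = clf_model.transform(test)
    accuracy = MulticlassClassificationEvaluator(
        labelCol="label_idx", predictionCol="prediction", metricName="accuracy"
    ).evaluate(predictions)
    f1 = MulticlassClassificationEvaluator(
        labelCol="label_idx", predictionCol="prediction", metricName="f1"
    ).evaluate(predictions)

    # Clustering (k-means chosen only as an example), scored by silhouette.
    km = KMeans(featuresCol="features", k=k, seed=seed)
    clustered = Pipeline(stages=[assembler, km]).fit(df).transform(df)
    silhouette = ClusteringEvaluator(
        featuresCol="features", predictionCol="prediction"
    ).evaluate(clustered)

    return {"accuracy": accuracy, "f1": f1, "silhouette": silhouette}
```

Calling `train_and_evaluate(df, ["f1", "f2"], label_col="species")` on a DataFrame with those (hypothetical) columns would return the three scores, which can then be plotted or tabulated for the report.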

Data Streaming analytics using PySpark

Complete two tasks for data streaming analytics. Include a screenshot of the working solution in the report.

Documentation: Write a scientific report.

Implementation Project

Task 1

Find a data set involving an interesting sequence of symbols: perhaps text, color sequences in images, or event logs from some device. Use word2vec to construct symbol embeddings from them, and explore through nearest neighbor analysis.
What interesting structures do the embeddings capture?
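
The nearest neighbor analysis in Task 1 can be sketched in plain Python once embeddings exist. The example below assumes the embeddings are already available as a dictionary from symbol to vector; the four three-dimensional vectors are hypothetical stand-ins for real word2vec output, and the function names are made up for this illustration. In PySpark, `Word2VecModel.findSynonyms` offers a comparable lookup on a trained model.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def nearest_neighbors(embeddings, query, k=3):
    """Return the k symbols most similar to `query`, as (symbol, similarity) pairs."""
    q = embeddings[query]
    scored = [
        (symbol, cosine(q, vec))
        for symbol, vec in embeddings.items()
        if symbol != query
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Hypothetical toy embeddings; real ones would come from a trained word2vec model.
toy = {
    "red": [0.9, 0.1, 0.0],
    "crimson": [0.8, 0.2, 0.1],
    "blue": [0.0, 0.1, 0.9],
    "navy": [0.1, 0.0, 0.8],
}

neighbors = nearest_neighbors(toy, "red", k=2)
for symbol, similarity in neighbors:
    print(f"{symbol}: {similarity:.3f}")
```

Inspecting which symbols end up close together (here the two reds, and the two blues) is one way to approach the question of what structure the embeddings capture.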

Task 2

Experiment with different discounting methods estimating the frequency of words in English. In particular, evaluate the degree to which frequencies on short text files (1000 words, 10,000 words, 100,000 words, and 1,000,000 words) reflect the frequencies over some large text corpora, say, 10,000,000 words.
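
One possible way to set up the comparison in Task 2 is sketched below in plain Python. It assumes two discounting methods, add-k smoothing and absolute discounting, and uses total variation distance as the measure of how closely a short sample reflects the reference corpus. These choices, the function names, and the twelve-word toy corpus are illustrative assumptions; the brief names no specific method or metric.

```python
from collections import Counter


def mle(counts, vocab):
    """Undiscounted relative frequencies over `vocab`."""
    total = sum(counts.get(w, 0) for w in vocab)
    return {w: counts.get(w, 0) / total for w in vocab}


def add_k(counts, vocab, k=1.0):
    """Add-k (Laplace when k=1) smoothing: every word gets k extra pseudo-counts."""
    total = sum(counts.get(w, 0) for w in vocab) + k * len(vocab)
    return {w: (counts.get(w, 0) + k) / total for w in vocab}


def absolute_discount(counts, vocab, d=0.5):
    """Subtract d from each seen count and share the freed mass among unseen words."""
    total = sum(counts.get(w, 0) for w in vocab)
    seen = [w for w in vocab if counts.get(w, 0) > 0]
    unseen = [w for w in vocab if counts.get(w, 0) == 0]
    if not unseen:
        # Nothing to redistribute to, so fall back to the undiscounted estimate.
        return mle(counts, vocab)
    probs = {w: (counts[w] - d) / total for w in seen}
    reserved = d * len(seen) / total
    for w in unseen:
        probs[w] = reserved / len(unseen)
    return probs


def total_variation(p, q):
    """Half the L1 distance between two distributions over the same vocabulary."""
    return 0.5 * sum(abs(p[w] - q[w]) for w in p)


# Toy stand-ins: the "large corpus" is 12 tokens, the "short file" its first 6.
reference_tokens = "the cat sat on the mat the dog sat on the log".split()
sample_tokens = reference_tokens[:6]

vocab = sorted(set(reference_tokens))
reference = mle(Counter(reference_tokens), vocab)
sample_counts = Counter(sample_tokens)

distances = {
    "mle": total_variation(mle(sample_counts, vocab), reference),
    "add_1": total_variation(add_k(sample_counts, vocab, k=1.0), reference),
    "abs_0.5": total_variation(absolute_discount(sample_counts, vocab, d=0.5), reference),
}
for name, dist in distances.items():
    print(f"{name}: {dist:.4f}")
```

For the actual task, the same comparison would be repeated with samples of 1000, 10,000, 100,000, and 1,000,000 words against the large reference corpus, so that the distance can be plotted against sample size for each method.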

Tip: You can use the YouTube video "Mining Big Data with Apache Spark" (URL from Week 2) as an example of how to implement these types of ML models.

Implementation Presentation

The presentation should be based on the report you produce. Follow the marking scheme so that you know how the presentation is expected to be delivered.

