COMP1804 Applied Machine Learning

Learning Outcome 1: Rationalise appropriate scenarios for Machine Learning applications and evaluate the choice of machine learning methods for given application requirements.

Learning Outcome 2: Demonstrate competency in using appropriate libraries/toolkits to solve given real-world Machine Learning problems, and develop and evaluate suitable applications.

Learning Outcome 3: Understand and apply the relevant input data preparation and processing required for the Machine Learning models used, and quantitatively evaluate and qualitatively interpret the learning outcome.

Learning Outcome 4: Recognise and critically address the ethical, legal, social and professional issues that can arise when applying Machine Learning technologies.

Sub-task 1: Text Classification/regression - peer reviews.

This task is to implement an ML solution for text classification/regression (long texts). It uses a dataset of ML paper peer reviews from the International Conference on Learning Representations (ICLR) for the years 2017 to 2020 [1,2].
Specifically, you will use as input a text document concatenating: the title of the paper, the abstract of the paper, the review comments, and the final acceptance/rejection comment. This input should be used to predict the following attributes:
• Acceptance status ('Accept' or 'Reject').
• Review score (an integer between 1 and 10).
Note that for the latter attribute you can choose whether to use multiclass classification or regression. You can choose whether to predict both features simultaneously or separately.
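
If the review score is treated as a regression target, the model's real-valued output has to be mapped back to the 1 to 10 integer scale before it can be compared with the labels. The following is a minimal sketch, assuming round-half-up followed by clipping; the function names and the toy numbers are illustrative and not part of the brief.

```python
import math


def to_review_score(raw_prediction, low=1, high=10):
    """Map a real-valued regression output to an integer score in [low, high]."""
    rounded = math.floor(raw_prediction + 0.5)  # round half up
    return max(low, min(high, rounded))         # clip to the valid score range


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted scores."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical regressor outputs and true scores, for illustration only.
raw_outputs = [6.4, 0.2, 11.7, 3.5]
true_scores = [6, 1, 10, 5]

predicted_scores = [to_review_score(r) for r in raw_outputs]
print(predicted_scores)
print(mean_absolute_error(true_scores, predicted_scores))
```

A regression formulation keeps the ordering of the scores (predicting 4 for a true 5 is a smaller error than predicting 1), which a plain multiclass formulation ignores; which of the two suits the data better is a design choice to justify in the report.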

Additionally, the dataset is provided with a further attribute, the reviewer confidence score (an integer number between 1 and 5), which is optional to use. If you want to explore the data further, a separate dataset with the text field split into the original fields "review comments", "paper title", "paper abstract" and "final acceptance/rejection comment" can be provided upon request.

Sub-task 2: Image classification - skin lesions.

This task is to implement an ML solution for a classification problem from images. Specifically, you are provided with images of skin lesions [3] and your task is to correctly predict the following attributes:
• Whether a skin lesion is benign or malignant (1 for 'is_benign', 0 for 'is_malign').
• The fine-grained diagnosis for the skin lesion (7 possible categories).
You can choose whether to predict both features simultaneously or separately. Additionally, the dataset is provided with a further attribute, the location of the skin lesion (for example, "scalp"), which is optional to use. If you want to explore the data further, a separate dataset with more attributes can be provided upon request. The dataset has been adapted to the requirements of this module; the original dataset was released under the terms of the CC BY-NC 4.0 licence by Tschandl et al. [3].

Sub-task 3: Image classification - advertisements.

This task is to implement an ML solution for a classification problem from images. Specifically, you are provided with images of advertisements [4] and your task is to correctly predict the topic of each advertisement.
• Images are of different sizes and there are 39 possible topic categories.
• You may choose to group together some of the categories (keeping no fewer than 12 categories). You should thoroughly discuss (and will be evaluated on) the reasons behind, and the implications of, grouping together different categories.
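
The category grouping described above can be expressed as an explicit mapping that is checked against the 12-category lower bound. This is only a sketch: the topic names and the grouping shown here are hypothetical placeholders, not the dataset's actual 39 topics.

```python
from collections import Counter

MIN_CATEGORIES = 12  # lower bound stated in the brief

# Hypothetical grouping: original topic -> merged category.
GROUPS = {
    "restaurants": "food_and_drink",
    "coffee": "food_and_drink",
    "soda": "food_and_drink",
    "cars": "vehicles",
    "motorbikes": "vehicles",
}


def group_topic(topic):
    """Return the merged category for a topic; unmapped topics are kept as-is."""
    return GROUPS.get(topic, topic)


def check_grouping(all_topics):
    """Apply the grouping to every topic and enforce the minimum category count."""
    merged = sorted({group_topic(t) for t in all_topics})
    if len(merged) < MIN_CATEGORIES:
        raise ValueError(
            f"only {len(merged)} categories left, need at least {MIN_CATEGORIES}"
        )
    return merged


# 39 placeholder topics: the 5 named above plus 34 generic ones.
all_topics = list(GROUPS) + [f"topic_{i:02d}" for i in range(34)]
merged_topics = check_grouping(all_topics)

# Label counts after grouping, useful for the discussion of class balance.
example_labels = ["coffee", "soda", "cars", "topic_00", "coffee"]
grouped_counts = Counter(group_topic(t) for t in example_labels)
print(len(merged_topics), grouped_counts)
```

Keeping the mapping in one place makes it easier to report exactly which topics were merged and to compare class balance before and after grouping.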

Sub-task 4: Text classification - Amazon reviews.

This task is to implement an ML solution for a multi-task classification problem from text data (mostly short texts). Specifically, you are provided with Amazon reviews [5] (the text is the review title and the review main body joined together) and your task is to predict the following attributes:
• The number of stars associated with the review (on a scale of 1 to 5).
• Whether a product is from the category "Video Games" ("video_games") or "Musical Instrument" ("musical_instrument").
Note that for the first attribute you can choose whether to use multiclass classification or regression. You can choose whether to predict both features simultaneously or separately.

Additionally, the dataset is provided with a further attribute: whether the review is verified or not (either True or False), which is optional to use. If you want to explore the data further, a separate dataset with the text field split into the original fields "review title", "review main body" can be provided upon request.
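
One possible way to predict both attributes simultaneously with a single classifier is to combine the star rating and the product category into one joint label (5 x 2 = 10 classes) and decode the prediction afterwards. The sketch below only shows the label encoding; it is one option among several (a multi-output model is another), and the function names are illustrative.

```python
STARS = [1, 2, 3, 4, 5]
CATEGORIES = ["video_games", "musical_instrument"]


def encode_joint(stars, category):
    """Combine a star rating and a category into one class index (0 to 9)."""
    return CATEGORIES.index(category) * len(STARS) + STARS.index(stars)


def decode_joint(label):
    """Recover (stars, category) from a joint class index."""
    category_idx, star_idx = divmod(label, len(STARS))
    return STARS[star_idx], CATEGORIES[category_idx]


# Encode two example targets, then decode them as a model's predictions would be.
joint_a = encode_joint(1, "video_games")
joint_b = encode_joint(5, "musical_instrument")
print(joint_a, joint_b, decode_joint(joint_b))
```

A joint label lets one model exploit correlations between the two attributes, but it multiplies the number of classes and can leave some combinations with very few examples, so per-attribute metrics should still be reported after decoding.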

Tasks:

1. Practical Assignment (complete code that executes without errors). The source code must be well documented and error-free (i.e. no debugging necessary to run). For each dataset, the assignment includes:
o Exploratory Data Analysis (e.g. label distributions per attribute and per set).
o Data cleaning.
o Data Splitting (into training and test sets, but see below) and Data Pre-processing (where appropriate: normalization/standardization, data augmentation, over/under-sampling, text processing).
o 2 ML Methodologies (a basic one & an additional one): appropriate ML methods should be used that have coherent implementations and sound pipelines, without any errors (if the basic ML method is a Neural Network, the additional one can be another Neural Network).
o Systematic experimentation: you should choose one parameter/attribute to change for each ML methodology (the attribute/parameter can be the same or different across the two methodologies) and show how it affects the results using clear and well-formatted figures and tables. Bonus points are given for experimenting using a validation dataset.
o Evaluation of the 2 methods using at least 2 metrics and showing 3-10 examples from the test dataset.
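
The splitting and exploratory steps listed above can be sketched as a stratified train/validation/test split followed by a per-set label distribution check. This is a minimal standard-library illustration; the 70/15/15 fractions, the function names, and the toy labels are assumptions, and a library routine such as scikit-learn's `train_test_split` with its `stratify` argument could be used instead.

```python
import random
from collections import Counter, defaultdict


def stratified_split(labels, fractions=(0.7, 0.15, 0.15), seed=0):
    """Return index lists (train, val, test) with similar label proportions."""
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, val, test = [], [], []
    for label in sorted(by_label):
        indices = by_label[label]
        rng.shuffle(indices)
        n_train = round(len(indices) * fractions[0])
        n_val = round(len(indices) * fractions[1])
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]  # whatever remains goes to test
    return train, val, test


def label_distribution(labels, indices):
    """Relative frequency of each label within the given subset."""
    counts = Counter(labels[i] for i in indices)
    total = sum(counts.values())
    return {label: round(count / total, 3) for label, count in sorted(counts.items())}


# Toy imbalanced labels, for illustration only.
labels = ["Reject"] * 80 + ["Accept"] * 20
train, val, test = stratified_split(labels)

for name, subset in [("train", train), ("val", val), ("test", test)]:
    print(name, len(subset), label_distribution(labels, subset))
```

Holding out a validation set in this way allows the chosen parameter to be varied and compared without touching the test set, which is then used only once for the final evaluation.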

2. Written Report:
• Document in IEEE conference format. Use the template available on Moodle or on Overleaf (make a copy of the Overleaf template).
• Should include references (citing other work) where appropriate (when images, data, code, or any other resources have been used from other sources).
• Document structure:
o Abstract: Briefly summarise what the report contains. That is: the task you are solving and why it is important; the outline of the ML methods you implemented and the systematic experimentation performed; the summary of your results and your conclusions. The abstract should be between 100 and 200 words.
o Introduction and related work: This section should talk about the following:
• The problem to be solved, why and to whom it matters, why it is challenging.
• Existing work related to your chosen task (it can be about the exact same task or a similar one).
• A brief overview of the dataset and the data pre-processing steps implemented.
• Your chosen ML implementations and a brief overview about why they are appropriate.
• What your systematic experiment is.
o Ethical discussion: Identify and discuss some of the social, ethical and legal implications of your chosen task, from data collection and processing to the ML prediction. The discussion should take into account communities and people that may be affected by the ML system.
o Dataset preparation: Describe exploratory data analysis, data cleaning, splitting and pre-processing and the reasons behind your design choices.
o ML methods: Describe and explain the 2 methods used and the reasons behind your design choices.
o Experiments and evaluation: Describe the systematic experimentation implemented for each ML method. Based on the experiments, evaluate, present, analyse and explain method performance and the metrics used (why are the metrics appropriate?).
o Discussion and future work: Reflections on a) what worked well and what worked less well; b) reasons behind the performance obtained; c) how your work could be extended in the future and what additions could be made to it.
o Conclusions: A brief summary of the work done and what the main highlights were.
o References: All existing works and resources (code/images/etc) you used or talked about in your report must be cited properly.
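
On the question of why a metric is appropriate, raised under "Experiments and evaluation", a small worked example shows how two metrics can disagree on imbalanced data. The sketch assumes a toy benign/malign label set (1 for 'is_benign', 0 for 'is_malign') and a hypothetical classifier that always predicts benign; the numbers are made up for illustration.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy data: 9 benign (1) and 1 malign (0); the classifier always says benign.
y_true = [1] * 9 + [0]
y_pred = [1] * 10

acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
print(f"accuracy={acc:.3f} macro_f1={f1:.3f}")
```

Accuracy looks high here because the majority class dominates, while macro-averaged F1 exposes that the minority class is never detected; this is the kind of argument the report can make when justifying its metrics.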
