SophiaPretty

(5)

$14/per page/Negotiable

About SophiaPretty

Levels Tought:
Elementary,Middle School,High School,College,University,PHD

Expertise:
Accounting,Algebra See all
Accounting,Algebra,Applied Sciences,Architecture and Design,Art & Design,Biology,Business & Finance,Calculus,Chemistry,Communications,Computer Science,Economics,Engineering,English,Environmental science,Essay writing Hide all
Teaching Since: Jul 2017
Last Sign in: 304 Weeks Ago, 5 Days Ago
Questions Answered: 15833
Tutorials Posted: 15827

Education

  • MBA,PHD, Juris Doctor
    Strayer,Devery,Harvard University
    Mar-1995 - Mar-2002

Experience

  • Manager Planning
    WalMart
    Mar-2001 - Feb-2009

Category > Computer Science Posted 12 Dec 2017 My Price 10.00

final project consists of a recommendation system.

Overview:

My final project consists of a recommendation system. The recommender system will predict the ratings that the user will give to the movies he/she has not rated yet. The output of your recommender system should be a file that contains the top 5 movie recommendations for every user in a descending order of their predicted ratings (i.e., the first movie recommendation should have the highest predicted rating). The format of the output file should be as follows: 

UserID MovieTitle1::predicted rating | MovieTitle2::predicted rating | MovieTitle3::predicted rating | MovieTitle4::predicted rating | MovieTitle5::predicted rating 

 

For example,

user ID: 1 top 5 recommendations: Cyclo (1995)::4.382758930148253| Office Killer (1997)::4.24082725472752| Little City (1998)::4.234925942215377| Death in Brunswick (1991)::4.224473123463171| Mamma Roma (1962)::4.178130372636395| 

This means that the top 5 movie recommendations for user ID 1 are: cyclo (1995) with predicted rating 4.38, office killer (1997) with predicted rating 4.2, etc. 

 

Quick Things:

  1. I may not hard-code the number of movies and the number of users. This information must be read from the input file.
  2. The data structure I use should be efficient both in terms of access time and memory usage. A simple 2D array for storing ratings will have a quick access time to each rating but it is very space inefficient because it wastes a lot of memory for unrated movies and hence, it is not an appropriate data structure to store ratings .
  3. My code should compute the similarity between every pair of movies and store the similarity values in a 2D array. 
  4. For each user u,
  5. Find the set of movies that u has already rated
  6. For each movie i that u has not rated yet:
  7. Use formula to predict rating that user u will give to movie i, based on the ratings that u has previously given to other movies.
  8. Find the top-5 movies with the highest predict rating for u. (I don't need to sort the entire array of predicted ratings for u to get the top 5)

 

 

Theoretically speaking, my final project should already compile but it isn't. I'm getting the following when running it on Netbeans:

Snip20170501_78.png

I'm attaching:

  1. movies.dat: Each line in the movies.dat represents a unique movie and has the following format: 

movie id | movie title | release date | video release date | etc.

2. ratings.dat: All ratings are contained in the file ratings.dat. Each line of ratings.dat has the following format ( the fields are separated by tab): 

UserID MovieID Rating Timestamp 

3. Recommend.java : my code. I suspect it's the path that maybe isn't working...

 

Where:

-UserIDs: the id of the user who gave rating

- MovieIDs: the id of the movie for which the user gave rating

- Ratings: a number in the scale (1-5) given to movie_ID by user_ID. 

 

I need help:

 

  1. I need to make it compile, and explaining the data structures used to store movie information and ratings and the algorithm used to compute the top 5 recommendation. 
  2. If my analysis of the order of magnitude efficiency (big –Oh) of the algorithm in terms of the number of users, and movies, is correct.

Efficiency:

  • O(users * movies2)
  • parseRatingsFile(): O(users * movies)
  • parseMovieFile(): O(movies)
  • similarity(): O(users * movies2)
  • for loop that iterates through users: O(users)predictedRatings(): O(movies)
  • getTop5(): O(movies)

 

 

Please respond with the corrected code and an explanation to make sense of it.

 

Thank you, I appreciate your help.

  • import java.io.BufferedWriter;
    import java.io.File;
    import java.io.FileWriter;
    import java.util.HashMap;
    import java.util.Scanner;


    public class Recommend {
       
        public static Object[] parseRatingsFile() {
            HashMap userRows = new HashMap();
            HashMap movieCols = new HashMap();
           
            int[][] ratings = null;
           
            try {
                Scanner ratingsDat = new Scanner(new File("src/ratings_small_example.dat"));
               
                // add one to each number, so user and movie IDs are one-indexed
                // using a zero-index would be too confusing
                // as a result, row 0 and column 0 will be empty
                int numUsers = 943 + 1;
                int numMovies = 1682 + 1;
                ratings = new int[numUsers][numMovies];
               
                while (ratingsDat.hasNextLine()) {               
                    // get user ID, movie ID, and rating from each line
                    String userIdStr = ratingsDat.next();
                    String movieIdStr = ratingsDat.next();
                    int rating = ratingsDat.nextInt();
                   
                    int userHashCode = 0;
                    try {
                        // if user ID is numeric
                        // matrix row index is set to user ID
                        userHashCode = Integer.parseInt(userIdStr);
                    }
            // if user ID contains non-numeric characters
            catch (NumberFormatException nfe) {
                        // strings are mapped to the sum of its characters' ASCII codes,
                        // multiplied by character's position
                        for (int i = 0; i < userIdStr.length(); i++) {
                            int asciiValue = (i + 1) * (int) userIdStr.charAt(i);
                            userHashCode += asciiValue;
                        }
                    }
                   
                    // call find key method
                    userHashCode = findKey(userRows, userHashCode, userIdStr, 943);
                   
                    int movieHashCode = 0;               
                    try {
                        // if movie ID is numeric
                        // matrix row index is set to movie ID
                        movieHashCode = Integer.parseInt(movieIdStr);
                    }
            // if movie ID contains non-numeric characters
            catch (NumberFormatException nfe) {
                        // strings are mapped to the sum of its characters' ASCII codes,
                        // multiplied by character's position
                        for (int i = 0; i < movieIdStr.length(); i++) {
                            int asciiValue = (i + 1) * (int) movieIdStr.charAt(i);
                            movieHashCode += asciiValue;
                        }
                    }
                   
                    // call find key method                       
                    movieHashCode = findKey(movieCols, movieHashCode, movieIdStr, 1682);
                           
                    // store user's rating at row (based on user IDs) and column (based on movie IDs)
                    ratings[userHashCode][movieHashCode] = rating;
                   
                    // go to next line of ratings_small_example.dat
                    ratingsDat.nextLine();
                }           
               
                ratingsDat.close();
            } catch (Exception e) {
                e.printStackTrace();
            }
           
            return new Object[]{ ratings, userRows, movieCols };       
        }
       
        public static int findKey(HashMap map, int rowCol, String id, int max) {
            // if the hash code is larger than the largest possible matrix row/column index
            if (rowCol > max) {
                // use modulo to convert hash code to valid range
                rowCol %= max;
            }
                   
            if (map.containsKey(rowCol)) {
                // if the row/column is already mapped to the right ID, do nothing
                if (map.get(rowCol).equals(id)) {
                    return rowCol;
                }
               
                // if there is collision
                else {
                    int j = 1;
                   
                    // resolve collision using quadratic probing
                   
                    // while the current row is occupied by a user ID
                    while (map.containsKey(rowCol)) {
                        // if previous probing has already mapped this row to the user ID,
                        // break out of while loop
                        if (map.get(rowCol).equals(id)) {
                          break;
                        }
                         
                        // quadratic probing
                        rowCol += (j * j);
                        j++;
                         
                        // if userRow is larger than the largest possible matrix row/column index,
                        // reset it to 0
                        if (rowCol > max) {
                          rowCol = 0;
                        }
                    }
                }
            }
           
            // if no collision, method will skip if statement and go directly here
            // place user ID and corresponding row into hashmap
            map.put(rowCol, id);
             
            return rowCol;
        }
       
        public static HashMap parseMovieFile() {
            HashMap moviesMap = new HashMap();
                   
            try {
                Scanner moviesDat = new Scanner(new File("src/movies_small_example.dat"));           
                           
                while (moviesDat.hasNextLine()) {
                    // stores information on a line in this string
                    String line = moviesDat.nextLine();
                   
                    // splits each line into different elements based on "|" delimiter
                    // consecutive "|", ie "||", are treated as 1 delimiter
                    // stores these elements into array
                    String[] info = line.split("[|]+");
                   
                    // movie ID is used to place movie name in the movies hashmap
                    String movieIdStr = info[0];
                   
                    // movie name is 2nd element
                    String movieName = info[1];
                   
                    // store movie ID and name into hashmap
                    moviesMap.put(movieIdStr, movieName);   
                }       
            } catch (Exception e) {
                e.printStackTrace();
            }
           
            return moviesMap;       
        }
       
        public static double[][] similarity(int[][] ratings) {
            // used to store similarity values of each movie (one-indexed)
            double[][] similarities = new double[1683][1683];
           
            // when 2 movies are the same movie, set the similarity value to 1
            // ratings[0].length = number of movies
            for (int movie = 1; movie < ratings[0].length; movie++) {
                similarities[movie][movie] = 1;
            }
           
            // compares each movie to every other movie
            for (int movie1 = 1; movie1 < ratings[0].length; movie1++) {
                // only compares movies to movies with greater movie IDs
                // b/c movies with smaller IDs have already been compared
                for (int movie2 = movie1 + 1; movie2 < ratings[0].length; movie2++) {
                    // gets similarity between the 2 movies
                    double similarity = getSimilarityBetween(ratings, movie1, movie2);
                   
                    /* taking advantage of the symmetry of pair-wise similarity,
                     * stores similarity value in 2 places: (i, j) and (j, i) 
                     */
                    similarities[movie1][movie2] = similarity;
                    similarities[movie2][movie1] = similarity;
                }
            }
           
            return similarities;
        }
       
        public static double getSimilarityBetween(int[][] ratings, int movie1, int movie2) {
            double numerator = 0;
            double movie1Denominator = 0;
            double movie2Denominator = 0;
           
            /* uses equation 2 from project spec
             * calculates numerator and denominator separately first
             */
            for (int user = 1; user < ratings.length; user++) {
                // takes the ratings a certain user has given for the 2 movies being compared,
                // multiplies them, and adds the product to the numerator
                int movie1Rating = ratings[user][movie1];
                int movie2Rating = ratings[user][movie2];
                numerator += (movie1Rating * movie2Rating);
               
                // squares each rating and adds it to each movie's part of the denominator
                movie1Denominator += (movie1Rating * movie1Rating);
                movie2Denominator += (movie2Rating * movie2Rating);           
            }
           
            // gets square root of each movie's part of the denominator
            movie1Denominator = Math.sqrt(movie1Denominator);
            movie2Denominator = Math.sqrt(movie2Denominator);
           
            // calculates similarity using numerator and denominator
            double similarity = numerator / (movie1Denominator * movie2Denominator);
           
            return similarity;
        }
       
        public static double[] predictedRatings(int[][] ratings, int userIndex, double[][] similarities) {
            // array for predicted ratings
            // length based on number of movies + 1 (to make it one-indexed)
            double[] predictedRatings = new double[ratings[userIndex].length];       
           
            for (int movie = 1; movie < ratings[userIndex].length; movie++) {
                // if there is no rating for a movie
                if (ratings[userIndex][movie] == 0) {
                    int notRatedMovie = movie;
                    double predictedRating = 0;
                   
                    double count = 0;               
                    for (int ratedMovie = 1; ratedMovie < ratings[userIndex].length; ratedMovie++) {
                        // for all the movies that the user has rated
                        // get the similarity between those movies and the unrated movie
                        if (ratings[userIndex][ratedMovie] != 0) {
                            // formula 1 part 1
                            // rating is weighted by similarity
                            int rating = ratings[userIndex][ratedMovie];
                            double similarity = similarities[ratedMovie][notRatedMovie];
                            double weightedAverage = rating * similarity;
                            predictedRating += weightedAverage;
                            count += similarity;
                        }
                    }
                   
                    // formula 1 part 2
                    predictedRating /= count;
                   
                    // store predicted rating in the array
                    predictedRatings[movie] = predictedRating;
                }
            }
           
            return predictedRatings;       
        }
       
        public static Object[] getTop5(double[] predictedRatings) {
            // initializes top 5 positions to first 5 elements of predictedRatings
            int[] top5Positions = {1, 2, 3, 4, 5};       
           
            // gets the ratings at the first 5 positions in predictedRatings and stores them in a parallel array
            double[] top5Ratings = new double[5];   
            for (int i = 0; i < top5Ratings.length; i++) {
                top5Ratings[i] = predictedRatings[top5Positions[i]];
            }
           
            // finds the lowest of the top 5 values
            int lowestRatingIndex = findLowestIndex(top5Ratings);
           
            // iterates through the rest of the predicted ratings array
            for (int i = 6; i < predictedRatings.length; i++) {           
                // if a certain element is larger than the lowest of the 5 top values
                if (predictedRatings[i] > top5Ratings[lowestRatingIndex]) {
                    // removes the current lowest element and replaces it with the
                    // larger element in top5Ratings
                    top5Ratings[lowestRatingIndex] = predictedRatings[i];
                   
                    // updates top5Positions with index of larger element
                    top5Positions[lowestRatingIndex] = i;
                   
                    // recalculates the lowest value of top5Ratings
                    lowestRatingIndex = findLowestIndex(top5Ratings);   
                }
            }
           
            // bubble sort for sorting the final top 5 values and positions in descending order
            for (int first = 0; first < top5Ratings.length; first++) {
                for (int second = first + 1; second < top5Ratings.length; second++) {
                    if (top5Ratings[first] < top5Ratings[second]) {
                        // reorder the values
                        double valueTemp = top5Ratings[first];
                        top5Ratings[first] = top5Ratings[second];
                        top5Ratings[second] = valueTemp;
                       
                        // reorder the positions of the values
                        int positionTemp = top5Positions[first];
                        top5Positions[first] = top5Positions[second];
                        top5Positions[second] = positionTemp;
                    }
                }
            }
           
            return new Object[]{top5Positions, top5Ratings};       
        }
       
        public static int findLowestIndex(double[] values) {
            int lowestRatingIndex = 0;
           
            /*
             * Iterates through 5 element array of values
             * Compares value at the lowest index with each element
             * If value of a certain element is smaller than the lowest value,
             * makes that element's position the new lowest value index
             */
            for (int counter = 0; counter < values.length; counter++) {
                if (values[lowestRatingIndex] > values[counter]) {
                    lowestRatingIndex = counter;
                }
            }
           
            return lowestRatingIndex;
        }

        public static void main(String[] args) {
            // start time (to measure running time of algorithm)
            long start = System.currentTimeMillis();
           
            System.out.println("Parsing ratings file...");
            Object[] ratingsFileOutput = parseRatingsFile();
            int[][] ratings = (int[][]) ratingsFileOutput[0];
            HashMap userRows = (HashMap) ratingsFileOutput[1];
            HashMap movieCols = (HashMap) ratingsFileOutput[2];
           
            System.out.println("Parsing movies file...");
            HashMap movies = parseMovieFile();
           
            System.out.println("Calculating similarities...");
            // get similarities between all movies for later referencing
            double[][] similarities = similarity(ratings);
           
            try {
                // create output file for top 5 recommendations for each user
                File recommendations = new File("recommendations.txt");
                BufferedWriter output = new BufferedWriter(new FileWriter(recommendations));
                           
                System.out.println("Starting output to file...");
               
                // iterates through users
                for (int userIndex = 1; userIndex < ratings.length; userIndex++) {
                   
                    // for each user,
                    // calculates their predicted ratings and gets the top 5
                    double[] predictedRatings = predictedRatings(ratings, userIndex, similarities);
                    Object[] top5 = getTop5(predictedRatings);
                    int[] top5Positions = (int[]) top5[0];
                    double[] top5Ratings = (double[]) top5[1];
                   
                    // writes user ID from userRows hashmap
                    output.write("user ID: " + userRows.get(userIndex) + " top 5 recommendations:");               
                   
                    // writes top 5 recommendations (movie name & predicted rating)
                    for (int movie = 0; movie < top5Positions.length; movie++) {
                        output.write(" " +
                                // gets movie name from movies hashmap
                                movies.get(String.valueOf(top5Positions[movie])) + "::" +
                                // predicted rating
                                top5Ratings[movie] + "|");
                    }
                    output.write("\n");
                }
               
                // records end time
                long end = System.currentTimeMillis();
                System.out.println("End output...");
                // calculates how long the algorithm took in milliseconds
                System.out.println(end - start + " ms");
                output.close();
               
            } catch (Exception e) {
                e.printStackTrace();
            }
        }

    }

Answers

(5)
Status NEW Posted 12 Dec 2017 12:12 PM My Price 10.00

-----------  ----------- H-----------ell-----------o S-----------ir/-----------Mad-----------am ----------- Th-----------ank----------- yo-----------u f-----------or -----------you-----------r i-----------nte-----------res-----------t a-----------nd -----------buy-----------ing----------- my----------- po-----------ste-----------d s-----------olu-----------tio-----------n. -----------Ple-----------ase----------- pi-----------ng -----------me -----------on -----------cha-----------t I----------- am----------- on-----------lin-----------e o-----------r i-----------nbo-----------x m-----------e a----------- me-----------ssa-----------ge -----------I w-----------ill----------- be----------- qu-----------ick-----------ly

Not Rated(0)