Data Mining: Bookstore Recommendation Project

Source: Slideplayer
I wrote a brief introduction to this project here, so be sure to check it out. Today, I will share the steps we took to solve this team project, the algorithm we used and, finally, the outcome. I love to keep my articles short, simple and straight to the point, so if there are a few gaps or questions, you can drop them in the comment section or contact me, and I will try to provide a solution if I can. Otherwise, let's get right into it.

The task was to recommend a set of books to promote together for a specific customer group, and to give recommendations to the CEO so that he/she can make better decisions and keep the business profitable. The good thing about this project is that a sample dataset was given to us, so all we needed to do was work with the data and provide the necessary recommendations.

WEKA was the software we were asked to use to solve this problem. I will break down the steps we followed in WEKA, accompanied by screenshots of our outputs.

1) We launched the WEKA application.
2) We imported the given dataset. Note: the dataset we were given was in ARFF format.
3) The first thing to note is that this is a recommendation problem, which narrowed down our choice of algorithm.
4) We decided to use association rule mining and applied the Apriori algorithm. Why? Because it is known for finding frequent itemsets and the associations between them.
5) After deciding on the approach, we needed to be sure there were no missing data in our dataset and to know the type of data we were given. This is the pre-processing stage.

A. Pre-processing Stage
1) Our data had 599 instances with 12 attributes. 
2) The data was initially of numeric (REAL) type. However, in order to apply the Apriori algorithm, it was converted to nominal using the 'NumericToNominal' unsupervised filter in WEKA.

Raw Data Visualization in WEKA
Data after Conversion in WEKA
3) The 'ID Transaction' attribute was removed because it does not add any value to the data mining approach.
4) At the end of the pre-processing stage, the dataset consisted of 11 attributes with either 0s or 1s, where 0 indicated that the item was not bought and 1 indicated that it was bought.
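
The two pre-processing steps can also be pictured outside WEKA. Below is a minimal Python sketch, not what WEKA does internally: the three rows and the book column names are made up for illustration (the real file has 599 rows and 12 columns), and `preprocess` is a hypothetical helper.

```python
# Hypothetical miniature of the bookstore data; rows and book columns are invented.
rows = [
    {"ID Transaction": 1.0, "Childbook": 1.0, "Cookbook": 0.0, "Youthbook": 1.0},
    {"ID Transaction": 2.0, "Childbook": 0.0, "Cookbook": 1.0, "Youthbook": 0.0},
    {"ID Transaction": 3.0, "Childbook": 1.0, "Cookbook": 1.0, "Youthbook": 1.0},
]


def preprocess(rows, id_column="ID Transaction"):
    """Drop the ID column and turn numeric 0.0/1.0 flags into nominal '0'/'1' labels."""
    cleaned = []
    for row in rows:
        cleaned.append({
            name: str(int(value))      # 1.0 -> '1', 0.0 -> '0'
            for name, value in row.items()
            if name != id_column       # the ID carries no purchase information
        })
    return cleaned


nominal_rows = preprocess(rows)
print(nominal_rows[0])
```

After this step every remaining value is a label ('0' or '1') rather than a number, which is the kind of input Apriori expects.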

B. Analysis Stage
1) In this stage, the Apriori algorithm was applied to the dataset. However, we discovered that WEKA built the model based only on unpurchased items 😕, which was not our intention. Our aim was to give recommendations based on purchased items, or 1s. How then do we move forward from here? 😖
2) WEKA of course has a feature to solve this, which we applied, and voila, we got some juicy outputs to work with 😋. What then is this feature?
3) Well, in the Apriori algorithm settings there is an option called "treatZeroAsMissing", which is set to "False" by default, so we set it to "True" and yes 💪, that was it.


4) We reran the algorithm, but no rules were found at the default 'minMetric' of 0.9, meaning that no rule reached 90% confidence.


5) However, we reduced the 'minMetric' to 0.8, 0.7 and even 0.6, and we were able to get some really good combinations, which we used for our recommendation/solution to the problem.

Output at 0.8 confidence level (3 best rules found)
Output at 0.7 confidence level (10 best rules found)

Output at 0.6 confidence level (10 best rules found)
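
To see why the confidence threshold decides how many rules come out, here is a small Python sketch. It is a brute-force search over made-up baskets, not WEKA's Apriori implementation (which also prunes by minimum support), and `count` and `find_rules` are hypothetical helpers. Each basket lists only the purchased items, which mirrors the effect of setting "treatZeroAsMissing" to "True".

```python
from itertools import combinations

# Hypothetical baskets: each set holds only the items that were bought (the 1s).
baskets = [
    {"Youthbook", "Cookbook", "Childbook"},
    {"Youthbook", "Cookbook", "Childbook"},
    {"Youthbook", "Cookbook"},
    {"Cookbook", "Childbook"},
    {"Youthbook", "Childbook"},
    {"Refbook"},
]


def count(itemset, baskets):
    """Number of baskets that contain every item in itemset."""
    return sum(1 for basket in baskets if itemset <= basket)


def find_rules(baskets, min_confidence, antecedent_size=2):
    """Brute-force search for rules A => c whose confidence is at least min_confidence."""
    items = sorted(set().union(*baskets))
    rules = []
    for antecedent in combinations(items, antecedent_size):
        a = set(antecedent)
        a_count = count(a, baskets)
        if a_count == 0:
            continue  # antecedent never occurs, so confidence is undefined
        for consequent in items:
            if consequent in a:
                continue
            confidence = count(a | {consequent}, baskets) / a_count
            if confidence >= min_confidence:
                rules.append((antecedent, consequent, round(confidence, 2)))
    return rules


for threshold in (0.9, 0.6):
    print(threshold, find_rules(baskets, threshold))
```

On this toy data no rule survives at 0.9, while lowering the threshold to 0.6 lets three rules through, which is the same pattern we saw when lowering 'minMetric' in WEKA.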

C. Analysis of Results
1) Each of the rules that we found has the form ‘A=>C’, which means that if a set of antecedents (A) is purchased, then there is a probability that the consequent (C) will also be purchased. For example, for the output at 0.8 confidence level, 78 transactions contained the purchase of a Youthbook and a Cookbook (A). Out of those transactions, 67 also contained a Childbook (C). This latter count is the ‘Support’ of the rule, i.e. the number of transactions containing both A and C. The Confidence score shows how reliable the association rule is, given the dataset. It is calculated as the number of transactions containing both A and C divided by the number containing A: 67 / 78 = 0.86.

2) Another interesting parameter is the ‘Lift’, which compares how often the consequent is purchased when the antecedents are present with how often it is purchased across the entire dataset. Basically, the larger the lift ratio (above 1), the more significant the association. To calculate Lift, we first needed the ‘Expected Confidence’, which is the probability of the consequent being purchased regardless of the antecedents. As an example, for the first rule (0.8 confidence), this is the total number of transactions containing a Childbook (250) divided by the total number of transactions (599): 250 / 599 = 0.417362270 (approximately 0.42).

3) After calculating the Expected Confidence, the Lift can then be calculated. This is the ratio of the confidence to the expected confidence: 0.86 / 0.417362270 = 2.06

4) With a confidence score of 86% and a lift of 2.06, this rule can be considered a strong association. That is the analysis of just one rule; I won't go through the analysis of all the rules in this article.

5) After building the model, these three best rules were found by WEKA:
  • If a Youthbook and a Cookbook are purchased in one transaction, there is 86% confidence that a Childbook will be purchased
  • If a Cookbook and a Refbook are purchased in one transaction, there is 83% confidence that a Childbook will be purchased
  • If a Cookbook and a Geogbook are purchased in one transaction, there is 82% confidence that a Childbook will be purchased
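
The confidence and lift arithmetic for the first rule can be double-checked with a few lines of Python. This is only a sketch of the calculation described above; the four counts are the ones quoted in this article, and the variable names are my own.

```python
# Counts quoted in the analysis for the rule {Youthbook, Cookbook} => Childbook.
total_transactions = 599
antecedent_count = 78     # transactions with a Youthbook and a Cookbook
rule_count = 67           # of those, transactions that also contain a Childbook
consequent_count = 250    # transactions containing a Childbook

confidence = rule_count / antecedent_count
expected_confidence = consequent_count / total_transactions
lift = confidence / expected_confidence

print(f"confidence          = {confidence:.2f}")
print(f"expected confidence = {expected_confidence:.2f}")
print(f"lift                = {lift:.2f}")
```

A lift above 1 means the Childbook is bought more often alongside these two books than it is overall.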

D. Our Recommendation
Based on our analysis and the results from WEKA, the decision/business model we would recommend is this: since Childbooks have a relatively strong association with Youthbooks, Cookbooks, Refbooks and Geogbooks, these books can be promoted together.


In conclusion, when trying to solve a data mining problem, there are several ways to go about it. There are also several ways to interpret your results after your analysis. However, I would recommend that you understand the basics of WEKA if you are not familiar with it. This will give you a better understanding of whatever project you are given and the different ways to go about it.

I hope this article helps someone out there trying to get the hang of a similar project. Below is a list of links that were very useful for us while we were solving this problem.
1) Building a market basket model
2) Market Basket Analysis with Association Rule Learning
3) Lift in Association Rule 


Until next time...💋


