R vs Python – a Comparison, and Awesome Free Books to Learn these Languages

· Roopam Upadhyay
Please read the disclaimer at the bottom of this article about the free PDF books.

The one thing they love more than a hero is to see a hero fail, fall, die trying. In spite of everything you’ve done for them, eventually, they will hate you [Spider-Man].

– Green Goblin / Norman Osborn

R vs Python - by Roopam

Batman v Superman: Dawn of Justice was released in March 2016. The film didn’t do too well, but making these two superheroes battle each other is an interesting idea. DC Comics introduced both superheroes through comic books in the late 1930s, and both fight crime and criminals. However, in over 75 years they have developed into contrasting characters, as different as day and night. Superman represents the bright, sunny side of life, while Batman represents the dark, chilling nights. Notably, Superman gets his superpowers from the sun, while Batman draws his power from the fear of bats and dark nights.

R vs Python – Superheroes

Let us continue with the theme of contrasting superheroes with a common mission. This time, we will make the superheroes of data science compete against each other – R vs Python. The idea for this article is to explain the superpowers of both R and Python, and also to suggest books to learn them. Most of these books are available online for free for the purpose of evaluation, and I will share those links here. To explain the superpowers of R and Python, let me create a few connections between them and the DC Comics superheroes.

You may find it unusual but I see a few similarities between R and Batman. Moreover, for me, Python and Superman have some things in common as well. Let me create a table to list these similarities.

Analysis Tool: R
Similar Superhero: Batman
Super Powers in Common:
  • Detective Work
  • Intelligence
  • Cunning
  • Usage of Tools
  • More Brain than Muscles

Analysis Tool: Python
Similar Superhero: Superman
Super Powers in Common:
  • Muscle Power
  • Super Strength
  • Elegance
  • Wide Range
  • More Muscles than Brain

Let me try to explain the reasons for these distinctions between R and Python in the next segment. Also, let us figure out a good approach for data scientists while using these languages.

R vs Python / R and Python : Which is a Good Approach?

Both R and Python are open-source, free-to-use, high-level programming languages. R was developed specifically for statistical computing, and it has plenty of add-on packages and tools to support machine learning and data analysis. Python, on the other hand, is a powerful general-purpose programming language with special applications in data preparation, data munging, and data analysis.

This distinction is also the reason different communities of analysts prefer one language or the other. Python is often preferred by computer programmers trying to develop skills in number crunching and analysis, while R is preferred by mathematicians and statisticians. This difference is glaring in the learning resources (books and online) for these languages. For instance, consider the following four books, which are available online for free.

YOU CANalytics Book Rating 5 out of 5 stars (5 / 5) – for all the 4 books mentioned below

  • Elements of Statistical Learning
  • An Introduction to Statistical Learning
  • Forecasting
  • Doing Bayesian Data Analysis

Three of these four books are high-quality statistical texts with R as the preferred language, and they are just a few examples. The first book does not use R, but it shares authors with the second book. You will rarely find books of this nature with Python as the preferred language. Hence, R is much better equipped to tackle data mining and statistical analysis problems. On the other hand, Python provides great tools for working with unstructured and complicated datasets such as images, written text (web pages, emails, etc.), genomics, and sound.

In essence, Python and R together complete the toolkit for a data scientist. Hence, for a pragmatic and application-oriented data scientist, it is essential to understand the super-powers and qualities of both these languages.

R Qualities

Use R for analysis, data visualization, and modeling

  • Offers great flexibility for analysis
  • R makes it easy to think while doing your analysis
  • Constant upgrades and enhancements of analysis packages because of a highly active community in statistics and mathematics
  • Exceptional data visualization tools

Python Qualities

Use Python for data preparation and data munging, especially for unstructured data such as web pages, images, and text

  • Great flexibility and ability to extract information from free text, websites, and social media sites
  • Good at mining images and preparing data for analysis
  • Can handle large volumes of data better than R

For a serious data scientist, it is a good idea to have some functional knowledge of both R and Python. Hence, a practical approach is to think of them together as R and Python – instead of R vs Python. In the following section, I will suggest books and online resources for both R and Python.

R – Books and Online Resources

In an earlier article on YOU CANalytics, I suggested many books and online resources to learn R, and I have recently added links to PDF files for those books. So, I suggest you revisit that post even if you have read it before. You can find it at the following link – Learn R : 12-books (Free PDFs) and Online Resources.

R and Python – Books and Online Resources

This book uses both R and Python for marketing analytics. It is rare to find books that use both languages.

Marketing Data Science: Modeling Techniques in Predictive Analytics with R and Python – Thomas W. Miller

YOU CANalytics Book Rating 4.7 out of 5 stars (4.7 / 5)

“When I prepare data for analysis or work on the web, I use Python. For modeling or graphics, I often use R” – this statement by the author of this book summarizes the way data scientists want to use R and Python. This is an excellent book to learn marketing analytics. It covers all the major data science activities for marketers, including pricing, promotion, product design, and recommendation. However, before you reach for this book, make sure you have some functional knowledge of either R or Python.

Partial Google Book

Python – Books and Online Resources

Now let me introduce a few books I have found useful to learn Python. I have divided these books into four different categories based on their utilities. These books will be presented under the following categories:

  1. Books: Python for General Purposes of Data Science
  2. Books: Python for Specialized Applications in Data Science
  3. Books: Python for Text Analytics
  4. Books: Python for Image Analytics

Also, be prepared to see some exotic animals on the cover pages of almost all the books to follow.

1. Books: Python for General Purposes of Data Science

Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython – Wes McKinney

YOU CANalytics Book Rating 4.2 out of 5 stars (4.2 / 5)

As I mentioned earlier, Python is excellent when it comes to data preparation, data munging, data wrangling, etc. This is a good book to start learning these skills. The book uses IPython, a friendly interactive interface, for all its code, which makes it easy for beginners and non-programmers to follow. Additionally, it provides good working knowledge of NumPy and Pandas.

Read Full PDF: Python for Data Analysis

Data Science from Scratch: First Principles with Python – Joel Grus

YOU CANalytics Book Rating 4 out of 5 stars (4 / 5)

This book has a much more balanced approach to theory and programming than most other books on the market with Python as the choice of language. I still feel there are many better books on R to learn the machine learning and statistical aspects of data science. However, if you want to learn these topics through Python, ‘Data Science from Scratch’ is not a bad book to start with.

Read Full PDF: Data Science from Scratch

2. Books: Python for Specialized Applications in Data Science

Programming Collective Intelligence: Building Smart Web 2.0 Applications – Toby Segaran

YOU CANalytics Book Rating 5 out of 5 stars (5 / 5)

This is a wonderful book for the following reasons: it is brilliantly written, fun to read, makes the reader think, and is quite practical. While reading this book you can easily figure out that the author loves his subject. Collective intelligence is about making decisions through the wisdom of the crowd instead of one expert opinion. The book introduces practical approaches to extract this knowledge from the web. Given that the book was written in 2007, some of its code is outdated. However, the underlying principles and ideas are extremely relevant and will continue to be so. I strongly recommend that you read this book.

Read Full PDF : Programming Collective Intelligence (Use the first link in the Google Search)

Mining the Social Web: Data Mining Facebook, Twitter, LinkedIn, Google+, GitHub, and More – Matthew A. Russell

YOU CANalytics Book Rating 4.8 out of 5 stars (4.8 / 5)

Are you interested in mining social media sites? Twitter, Facebook, LinkedIn, Google+: this book has chapters on extracting information from all these sites and more. It is an especially good book for extracting information from Twitter. However, I must offer a word of caution: the APIs for these social media sites change quite regularly, so you will hit a roadblock a few times while using the code from this book. I suggest you buy the latest edition and refer to the internet during your practice sessions.

Read Full PDF : Mining the Social Web (1st Edition)

3. Books: Python for Text Analytics

Text Processing in Python – David Mertz

YOU CANalytics Book Rating 4.5 out of 5 stars (4.5 / 5)

One of the most complicated problems in machine learning is extracting meaning from free-flowing text through algorithms. This book will introduce you to the wonderful world of text processing in an intuitive fashion. You will learn about string functions and operations, regular expressions, text parsing, etc. This is a great book to start your text processing journey. Notice, there are no animals on the cover of this book – how fascinating!
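To give a feel for the three topics named above (string operations, regular expressions, and simple parsing), here is a minimal sketch that uses only Python's standard library. It is not taken from the book: the sample sentence, the variable names, and the deliberately simple email pattern are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical sample text, invented for this illustration
sample = "Python is fun. R is fun too! Write to data@example.com for 2 free books."

# String methods: trim whitespace and normalise case
cleaned = sample.strip().lower()

# Regular expression: every run of letters is treated as a word
words = re.findall(r"[a-z]+", cleaned)
counts = Counter(words)

# Regular expression: a simple email pattern (illustrative, not a full validator)
emails = re.findall(r"[\w.+-]+@[\w-]+\.[\w.]+", sample)

# Simple parsing: split into sentences after '.', '!' or '?' followed by whitespace
sentences = re.split(r"(?<=[.!?])\s+", sample)

print(counts.most_common(2))
print(emails)
print(sentences)
```

Even this toy version shows the usual rhythm of text processing: normalise the text, extract tokens or patterns, then count or restructure them.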

Read Full PDF: Text Processing in Python

Natural Language Processing with Python – Steven Bird, Ewan Klein, and Edward Loper

YOU CANalytics Book Rating 3.8 out of 5 stars (3.8 / 5)

The book can be considered a manual for NLTK (Natural Language Toolkit), a Python library. NLTK is a powerful toolkit for natural language processing (NLP), i.e. making machines understand human languages. This book doesn’t cover the theoretical depth and nuances of NLP, which is a bit frustrating. However, it is still a good book to learn NLTK.

Read Full PDF: Natural Language Processing with Python

4. Books: Python for Image Analytics

Programming Computer Vision with Python – Jan Erik Solem

YOU CANalytics Book Rating 4.3 out of 5 stars (4.3 / 5)

A greyscale digital image is just a large matrix of numbers, one per pixel. A color image has 3 such matrices, one each for the red, green, and blue (RGB) levels. A full HD TV has image dimensions of 1920 x 1080 pixels, so a color image with these dimensions stores over 6 million numbers (1920 x 1080 x 3 = 6,220,800) to represent the RGB values of its pixels. If you want to learn more about manipulating image matrices and image processing, read this book. It is a gentle introduction to computer vision. The question is, can the computer see the world the way you and I see it?
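The idea that an image is just matrices of numbers can be sketched in a few lines of plain Python. This is an illustration only, not code from the book: the function names are hypothetical, nested lists stand in for the NumPy arrays a real project would use, the frame size assumes a 1920 x 1080 full HD image, and the greyscale conversion assumes the common ITU-R BT.601 luma weights (0.299, 0.587, 0.114).

```python
def value_count(width, height, channels=3):
    """Number of stored values for an image of the given size."""
    return width * height * channels


def to_greyscale(rgb_image):
    """Collapse rows of (r, g, b) pixels into a single matrix of grey levels.

    Uses the BT.601 luma weights; other weightings exist.
    """
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in rgb_image
    ]


# A hypothetical 2 x 2 color image: red, green, blue, and white pixels
tiny = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]

grey = to_greyscale(tiny)

print(value_count(1920, 1080))  # 6220800 values for a full HD color frame
print(grey)                     # [[76, 150], [29, 255]]
```

The color image carries three numbers per pixel, while the greyscale version keeps one, which is why the same picture shrinks from three matrices to a single matrix.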

Read Full PDF: Programming Computer Vision with Python

Learning OpenCV: Computer Vision with the OpenCV Library – Gary Bradski & Adrian Kaehler

YOU CANalytics Book Rating 4.8 out of 5 stars (4.8 / 5)

Computer vision is a fascinating topic, as mentioned earlier. While we see a picture of a butterfly, the computer sees matrices of numbers. The question is, how do we make the computer identify the butterfly within those pixel numbers? OpenCV (Open Source Computer Vision Library) is a powerful library written in C and C++ that has answers to this question, and it can be called from Python for image processing. This book is a great introduction to OpenCV, and a must-read if you want to learn image processing and image analytics.

Read Full PDF: Learning OpenCV

Sign-off Note

The one thing they love more than a hero is to see a hero fail, fall, die trying. In spite of everything you’ve done for them, eventually, they will hate you [Spider-Man].

– Green Goblin / Norman Osborn

I guess we do love to see our heroes fail. Why else would we make them compete against each other? I don’t know the reason for this. Possibly, as a race, we are sadistic creatures. Possibly, we are just jealous of people better than us. Possibly, we love sadness despite our claims about our love for happiness. Possibly, we relate to the demons these superheroes fight. Possibly, we believe in the futility of life.

All the above reasons are only half-truths to me. For me, a more likely reason is that we have both day and night inside us. Some days it is bright and sunny for us, and on other days it is pitch-dark. Let us embrace the grayness of life. In the same breath, let us embrace both Python and R with their individual shortcomings. Let’s stop pulling our superheroes down.

Disclaimer: Roopam Upadhyay or YOU CANalytics has no affiliation with either the authors of the books or the websites hosting the PDF books shared in this post. I am assuming that none of the PDF file links I have shared in this article is a copyright infringement, since they are among the top Google search results. Several of these files are from either the authors' webpages or from scholarly links. In case you believe otherwise about any link, please let me know and I will remove that link.

16 thoughts on “R vs Python – a Comparison, and Awesome Free Books to Learn these Languages”

  1. venkataraman says:
    November 6, 2015 at 8:49 am

    Thanks a lot and very nice way of presentation and suggestion

  2. Denisia J says:
    November 8, 2015 at 7:25 pm

    Thank you for discussing the similarities and differences btw R and Python. Nice simplification of their uses and great resource with available texts to learn Python software.

  3. M. Reza says:
    November 25, 2015 at 2:20 am

    Hi,

    Nice website and updates. Enjoyed.

    Cheers,

  4. Frank Sauvage says:
    November 29, 2015 at 12:56 am

    Thanks a lot for this nice article and very useful resources! As an R addict who senses access to Python’s world would widen the world, I greatly appreciate the recommandation for a good start with the blue and yellow snakes!
    Best wishes

  5. Sandeep says:
    December 4, 2015 at 10:40 pm

    Hello Sir,

    I am a bit out of context here in terms of the topic discussed above.

    Can you please help me with some information on ‘Beta Distribution’. How to fit Beta distribution on an existing dataset varying from 0 to 1.

    • Roopam Upadhyay says:
      January 25, 2016 at 7:42 am

      There are two methods for this. You could either use maximum likelihood estimate to calculate alpha and beta parameters for the beta distribution, or simply use mean and standard deviation of your data for the same.

  6. Jan Peter Axelsson says:
    January 28, 2016 at 4:33 pm

    Hello,
    Thanks for characterizing Python and R somewhat. Works mainly with Python myself and got more interested in R by your article. Interesting with the ref to Hasties et als books. I would just like to add a concise good book to start use Python (together with libraries numpy, matplotlib etc) which is a good book if you know some programming already and focus is on use Python for computations.

    Claus Führer et al, Computing with Python – an introduction to Python for science and engineering, published by Pearsson 214.
    http://www.amazon.com/Computing-Python-introduction-science-engineering-ebook/dp/B00IZI60QC/ref=sr_1_1?ie=UTF8&qid=1453978821&sr=8-1&keywords=fuhrer+computing+with+python

  7. Sape says:
    May 4, 2016 at 6:50 pm

    Very nice website.

  8. Nagesh says:
    July 29, 2016 at 3:11 am

    Awesome Blogs and topics with free pdfs. Really I will recommend this website to my friends as well..

  9. REVA MCHALE says:
    September 3, 2016 at 8:59 am

    Lovely article – one of the best things I’ve recently read, and by far the most useful.

  10. Life Skipper (@gelosoil) says:
    December 16, 2016 at 6:36 pm

    Hi all
    Me,i started with R ,without any prior knowledge of programming(even mathematics basics at that time were a mystery to me).
    But putting an effort (in the NYJH coursera course on R programming was the first ) ,I overcame my lack of prior knowledge and this I think was due to R’s easy(in my view)programming style and the language idiosyncrasy.
    Also i guess the teaching style had something to do with it.
    I ve also taken the ML course in Stanford ,by the authors of the 2 books you present (ISLR and Stat.learning)and found it to be an eye opener .(also R used in the course )
    In all,from the +10 R courses i ve taken ,these were the ones that helped me the most.

    I know most people suggest to fresh starters to follow the path from Python to data science,but to me python was like reading hieroglyphics at that time..:)
    But then again, i am not one who s famous for his perception >:):)

    Eventually of course i had to learn python too.As the article well puts it,you need both skills if you are to “play” with data in the real world…
    Plus a few other skills (if your target is to become an analyst of some kind)
    On python i ve taken several introductory courses ,plus a couple with spark (where you will find that knowing python helps).I am at this stage for now,trying to make sense of it all..:)
    regards to all

  11. softwaretrainingweb says:
    February 28, 2017 at 12:54 pm

    Hiii…..Your blog about the Python and R programming is really much informative and helpful to all people…Thanks for your informative sharing….

  12. Sikander Malik says:
    November 11, 2017 at 8:58 pm

    Thank you

  13. Emeka says:
    March 2, 2018 at 3:05 am

    Thank you so much

  14. rajat says:
    April 15, 2019 at 2:08 pm

    Python for Data Analysis by Wes McKinney is gold. Why? Because Wes McKinney is one of the creator of Pandas itself [1]. So whatever this book teaches comes directly from the creator.

    Ref.
    [1]

    • Rajat says:
      April 29, 2019 at 4:30 pm

      [1] https://en.wikipedia.org/wiki/Wes_McKinney


Credit

I must thank my wife, Swati Patankar, for being the editor of this blog.

© Roopam Upadhyay