
A year of experience as a Data Scientist

On June 3rd, 2019, I joined ZS Associates as a Data Scientist after graduating from IIIT Sri City. It was my first job, and I was very happy to get placed as a Data Scientist through lateral hiring. If you haven’t read about my Data Science journey, please read it here :)

After joining, I had some awesome moments that I had never experienced before:
  • I got to stay in 4-star and 5-star hotels multiple times.
  • I got to travel by flight. I travelled to Pune, Delhi and Bangalore, and saw the Vizag, Pune, Delhi and Bangalore airports in less than six months. I loved it.
  • I attended a few office parties and outings during Diwali and New Year celebrations.
These are moments I will never forget. My first job allowed me to experience many firsts. Enjoying life is more important than anything; if you don’t enjoy your life, you cannot achieve anything big.

Okay, let’s get into the main topic in detail.

Me (inner voice during BTech): I know a bunch of algorithms and have participated in a few hackathons. I have some experience solving problems. I can model anything if I have data. I can start applying for jobs and help industries solve problems using Data Science.

After one year...

Me (inner voice after one year of experience): It’s not as simple as I thought during BTech. Solving real-world problems involves a lot of things. Domain knowledge plays an important role. Learning Python (pandas) and Keras alone cannot solve every problem. It’s all about solving problems with the help of the technology and tools available.

I am not saying that I was wrong during BTech. The environment around us makes us think like that, perhaps because of the limited information available to us or a lack of guidance. Still, I suggest students spend more time learning modelling algorithms, improving mathematical skills, participating in hackathons, etc., because these are the most important skills for cracking a data science interview. You can build the remaining skills while working on real projects.

In my view, the below points are very important to be a successful Data Scientist:
  1. Domain knowledge: As data scientists, we extract insights from data. If we are not aware of what the data is about, how can we extract insights? So, data scientists should spend time understanding the business problem.

  2. Continuous learning: Data Science is an ocean. It cannot be mastered in a few days. Frankly, no one can learn everything related to Data Science. Based on the problem at hand, we should quickly explore new algorithms, tools, etc. Of course, there will be people around to help you if you are stuck on something.

  3. Enjoying your role: Sometimes, data science can be frustrating. You may not find enough resources on recent developments, or you may face challenges that no one has solved before. In these situations, be patient and enjoy your work. Be curious to learn and explore new technologies.

One more point I want to stress here: we use Data Science to help businesses grow. The domain in which we apply Data Science may differ, but the main goal is the same, i.e., using data to grow businesses. For example, Amazon uses customer data to improve its recommendation systems. This helps customers buy more items from Amazon, which in turn grows Amazon’s sales.

In some cases, Data Science alone cannot impact customers/end users. Building a successful product involves many teams: Software Engineers, Data Engineers, DevOps Engineers, and more. It takes effort from multiple teams to ship a product, so we should respect every role. This is something everyone should understand.

Coming to my learnings: I explored new algorithms and techniques and solved problems using PySpark, Dask, SQL, Python libraries, etc. The list goes on, depending on the requirement. As a data scientist, learning never stops. It’s really fun :)

As part of my daily routine, I run experiments, take part in brainstorming sessions, discuss insights, write emails, explore new algorithms, etc.

The above is based on my own experience and views; it may vary by team, company, etc. Please note that this article is for educational purposes only.

Thank you so much for reading my blog and supporting me. Stay tuned for my next article. If you want to receive email updates, don’t forget to subscribe to my blog.

If you are looking for any specific blog, please do comment in the comment section below.


