Monday, August 29, 2016

Interview questions in big data field

Dear Technocrats,

In this post we begin a series of interview preparation material for those looking to enter the field of big data analytics. First, let's go over some common interview tips:

  1. Listen attentively to each question before answering.
  2. Be specific and precise in your answers.
  3. In today's fast-changing IT industry, recruiters look more for well-educated personnel than for merely well-trained ones; understand the difference.
  4. Focus more on the outcome of learning than on the syntax of learning. Having some idea of the business use of your technology domain is a plus.
  5. Show flexibility rather than rigidity about any technology or platform, especially if you are a fresher.
  6. Asking the interviewer one or two questions about the company is considered good practice, but avoid getting into prolonged arguments.
  7. A common question can be: "When is a person in this post considered successful?"
 
Below is a set of questions you can expect to be asked in your interview.

Basic Questions:

  1. What do you understand by big data, and what techniques are used to handle it?
  2. What is the difference between structured and unstructured data? Support your answer with examples.
  3. What do you know about NoSQL databases? How are they different from an RDBMS?
  4. What is MapReduce? Explain its phases in detail.
  5. What is a distributed file system? How is it different from a conventional file system? Explain both with examples.
  6. What are the limitations/shortcomings of the MapReduce framework?
  7. Do you know about IBM Watson? How is it helpful in big data analytics?
  8. Is there any relation between big data analytics and cloud computing?
  9. Define horizontal scalability. What are its benefits in the Hadoop framework?
  10. Explain the role and working of the NameNode, DataNode, JobTracker and TaskTracker.
  11. What is the difference between Hadoop 1.x and Hadoop 2.x?
  12. Explain sharding and its importance.
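
To prepare for the MapReduce question in the list above, it can help to see the phases in miniature. The following is only a single-machine sketch in plain Python, not Hadoop's actual API: the function names (map_phase, shuffle_phase, reduce_phase, word_count) are made up for illustration, and a real cluster would run the map and reduce steps in parallel across many nodes.

```python
from collections import defaultdict


def map_phase(documents):
    """Map: emit a (word, 1) pair for every word in every input record."""
    for doc in documents:
        for word in doc.lower().split():
            yield word, 1


def shuffle_phase(pairs):
    """Shuffle/sort: group all values that share the same key, sorted by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(sorted(grouped.items()))


def reduce_phase(grouped):
    """Reduce: combine the list of values for each key into a single result."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(documents):
    """Chain the three phases; illustrative only, runs on one machine."""
    return reduce_phase(shuffle_phase(map_phase(documents)))


if __name__ == "__main__":
    print(word_count(["big data is big", "data is everywhere"]))
```

In an interview, being able to say which of these steps runs on the mappers, which on the reducers, and where data moves over the network (the shuffle) usually matters more than the syntax.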

Advanced-level questions:

  1. How can Kafka be integrated with Hadoop / Spark for stream processing?
  2. What is the use of NiFi in big data processing frameworks?
  3. Which NoSQL database is suited for storage and processing of binary data (images)?
  4. What is the difference between RDDs and DataFrames in Spark?

For more such questions, discussions, polls and technical articles on the latest technologies for big data analytics, check out the posts on the DataioticsHub page.


If you are new to big data analytics, please start with the basics in this post. To understand and learn the complete technology stack for big data engineering, visit DataioticsHub.

Saturday, August 20, 2016

Software Defined Networks


Software Defined Networks:

Networking lies at the core of any IT infrastructure. It is hard to think of any computer-based business system that lacks networking capabilities. Good networking supports the growth of a computer-based system and of every business module that relies on it.
So it's time to upgrade and expand network capabilities to meet the growing needs of the IT industry. Those needs have changed with the SMAC (social, mobile, analytics and cloud) model of business. Every organization wants to be connected more closely with its customers, and every customer and every piece of feedback matters if an organization is to keep an edge over its competitors. This demands very robust and flexible network capabilities from network providers.
Because networking devices are costly, expanding a network into new areas is expensive. All these driving forces led to the advent of "Software Defined Networks" (SDN).
   SDN separates the control logic from the underlying hardware and provides centralized administration of the network. It enables improved networking capabilities in cloud data centers.
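
The separation of control logic from forwarding hardware can be pictured with a toy model. This is a rough sketch in plain Python, not OpenFlow or any real controller API: the Switch and Controller classes, the push_route method and the addresses are all hypothetical names chosen for illustration.

```python
class Switch:
    """Data plane: forwards using only the rules it has been given."""

    def __init__(self, name):
        self.name = name
        self.flow_table = {}  # destination address -> output port

    def install_rule(self, dst, out_port):
        self.flow_table[dst] = out_port

    def forward(self, dst):
        # Returns the output port, or None on a table miss
        # (a real switch would typically ask the controller at this point).
        return self.flow_table.get(dst)


class Controller:
    """Control plane: one central place that decides rules for all switches."""

    def __init__(self):
        self.switches = {}

    def register(self, switch):
        self.switches[switch.name] = switch

    def push_route(self, dst, ports_by_switch):
        # Install a rule for `dst` on every switch along the chosen path.
        for name, port in ports_by_switch.items():
            self.switches[name].install_rule(dst, port)


def demo():
    controller = Controller()
    s1, s2 = Switch("s1"), Switch("s2")
    controller.register(s1)
    controller.register(s2)
    controller.push_route("10.0.0.2", {"s1": 2, "s2": 1})
    return s1.forward("10.0.0.2"), s2.forward("10.0.0.2"), s1.forward("10.0.0.9")


if __name__ == "__main__":
    print(demo())
```

The switches hold no policy of their own; changing how traffic flows means changing what the controller pushes, which is the centralized administration described above.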

    From an academic research point of view, you can use either of the open-source tools NS-3 ("Network Simulator 3") or Mininet for simulation, and ride the wave of SDN by contributing some good research to the community. NS-3 includes OpenFlow as a protocol module to showcase the functioning of SDN. The module is written in C++, with optional Python bindings. Basic knowledge of Linux is a plus in the networking domain, especially in SDN.

Keep an eye on updates to this post for deeper aspects of SDN with practical exposure.