Data Scientist
ROLE AND RESPONSIBILITIES
Should be able to work as an individual contributor and maintain good relationships with stakeholders. Should
be proactive in learning new skills as business requirements demand. Should be familiar with extracting relevant
data, then cleansing and transforming it into insights that drive business value, using data analytics, data
visualization and data modeling techniques.
QUALIFICATIONS AND EDUCATION REQUIREMENTS
Technical Bachelor’s Degree.
Non-Technical Degree holders should have 1+ years of relevant experience.
CORE RESPONSIBILITIES
- Create and manage cloud resources in AWS
- Ingest data from different sources that expose data through different technologies, such as RDBMS, REST HTTP APIs, flat files, streams, and time-series data from various proprietary systems. Implement data ingestion and processing using Big Data technologies
- Data processing/transformation using various technologies such as Spark and Cloud Services. You will need to understand your part of business logic and implement it using the language supported by the base data platform
- Develop automated data quality checks to make sure the right data enters the platform and to verify the results of the calculations
- Develop an infrastructure to collect, transform, combine and publish/distribute customer data.
- Define process improvement opportunities to optimize data collection, insights and displays.
- Ensure data and results are accessible, scalable, efficient, accurate, complete and flexible
- Identify and interpret trends and patterns from complex data sets
- Construct a framework utilizing data visualization tools and techniques to present consolidated analytical and actionable results to relevant stakeholders.
- Key participant in regular Scrum ceremonies with the agile teams
- Proficient at developing queries, writing reports and presenting findings
- Mentor junior members and bring best industry practices
QUALIFICATIONS
- 5-7+ years’ experience as a data engineer in consumer finance or an equivalent industry (consumer loans, collections, servicing, optional product, and insurance sales)
- Strong background in math, statistics, computer science, data science or related discipline
- Advanced knowledge of at least one of the following languages: Java, Scala, Python, C#
- Production experience with: HDFS, YARN, Hive, Spark, Kafka, Oozie / Airflow, Amazon Web Services (AWS), Docker / Kubernetes, Snowflake
- Proficient with:
  - Data mining/programming tools (e.g. SAS, SQL, R, Python)
  - Database technologies (e.g. PostgreSQL, Redshift, Snowflake, and Greenplum)
  - Data visualization (e.g. Tableau, Looker, MicroStrategy)
- Comfortable learning about and deploying new technologies and tools.
- Organizational skills and the ability to handle multiple projects and priorities simultaneously and meet established deadlines.
- Good written and oral communication skills and ability to present results to non-technical audiences
- Knowledge of business intelligence and analytical tools, technologies and techniques.
Mandatory Requirements
- Experience in AWS Glue
- Experience in Apache Parquet
- Proficient in AWS S3 and data lake
- Knowledge of Snowflake
- Understanding of file-based ingestion best practices.
- Scripting languages: Python and PySpark
- Understand long-term and short-term business requirements and match them precisely with the capabilities of the different distributed storage and computing technologies available in the ecosystem.
- Create complex data processing pipelines
- Design scalable implementations of the models developed by our Data Scientists.
- Deploy data pipelines in production systems based on CI/CD practices
- Create and maintain clear documentation on data models/schemas as well as transformation/validation rules
- Troubleshoot and remediate data quality issues raised by pipeline alerts or downstream consumers
You will:
- Create highly scalable AWS micro-services utilizing cutting edge cloud technologies.
- Design and develop Big Data pipelines handling huge geospatial data.
- Bring clarity to large complex technical challenges.
- Collaborate with Engineering leadership to help drive technical strategy.
- Scope, plan and estimate projects.
- Mentor and coach team members at different levels of experience.
- Participate in peer code reviews and technical meetings.
- Cultivate a culture of engineering excellence.
- Seek, implement and adhere to standards, frameworks and best practices in the industry.
- Participate in on-call rotation.
You have:
- Bachelor’s/Master’s degree in computer science, computer engineering or relevant field.
- 5+ years of experience in software design, architecture and development.
- 5+ years of experience using object-oriented languages (Java, Python).
- Strong experience with Big Data technologies like Hadoop, Spark, MapReduce, Kafka, etc.
- Strong experience in working with different AWS technologies.
- Excellent competencies in data structures & algorithms.
Nice to have:
- Proven track record of delivering large scale projects, and an ability to break down large tasks into smaller deliverable chunks
- Experience in developing high throughput low latency backend services
- Affinity for spatial data structures and algorithms.
- Familiarity with Postgres DB, Google Places or Mapbox APIs
What we offer
At GroundTruth, we want our employees to be comfortable with their benefits so they can focus on doing the work they love.
- Unlimited Paid Time Off
- In Office Daily Catered Lunch
- Fully stocked snacks/beverages
- 401(k) employer match
- Health coverage including medical, dental, vision and option for HSA or FSA
- Generous parental leave
- Company-wide DEIB Committee
- Inclusion Academy Seminars
- Wellness/Gym Reimbursement
- Pet Expense Reimbursement
- Company-wide Volunteer Day
- Education reimbursement program
- Cell phone reimbursement
- Equity Analysis to ensure fair pay
Job Description
- Solid technical skills with a proven and successful history working with data at scale and empowering organizations through data
- Big data processing frameworks: Spark, Scala, Hadoop, Hive, Kafka, EMR with Python
- Advanced experience and hands-on architecture and administration experience on big data platforms
Ganit has flipped the data science value chain: we do not start with a technique, because for us consumption comes first. With this philosophy, we have successfully scaled from a small start-up to a 200-person company with clients in the US, Singapore, Africa, UAE, and India.
We are looking for experienced data enthusiasts who can make the data talk to them.
You will:
- Understand business problems and translate business requirements into technical requirements.
- Conduct complex data analysis to ensure data quality & reliability i.e., make the data talk by extracting, preparing, and transforming it.
- Identify, develop and implement statistical techniques and algorithms to address business challenges and add value to the organization.
- Gather requirements and communicate findings in the form of a meaningful story with the stakeholders
- Build & implement data models using predictive modelling techniques. Interact with clients and provide support for queries and delivery adoption.
- Lead and mentor data analysts.
We are looking for someone who has:
- Apart from your love for data and ability to code even while sleeping, you would need the following.
- A minimum of 2 years of experience in designing and delivering data science solutions.
- Successful projects in retail/BFSI/FMCG/manufacturing/QSR in your kitty to show off.
- Deep understanding of various statistical techniques, mathematical models, and algorithms to start the conversation with the data in hand.
- Ability to choose the right model for the data and translate that into code using R, Python, VBA, SQL, etc.
- Bachelor’s/Master’s degree in Engineering/Technology, an MBA from a Tier-1 B-School, or an MSc in Statistics or Mathematics
Skillset Required:
- Regression
- Classification
- Predictive Modelling
- Prescriptive Modelling
- Python
- R
- Descriptive Modelling
- Time Series
- Clustering
What is in it for you:
- Be a part of building the biggest brand in Data science.
- An opportunity to be a part of a young and energetic team with a strong pedigree.
- Work on awesome projects across industries and learn from the best in the industry, while growing at a hyper rate.
Please Note:
At Ganit, we are looking for people who love problem solving. You are encouraged to apply even if your experience does not precisely match the job description above. Your passion and skills will stand out and set you apart—especially if your career has taken some extraordinary twists and turns over the years. We welcome diverse perspectives, people who think rigorously and are not afraid to challenge assumptions in a problem. Join us and punch above your weight!
Ganit is an equal opportunity employer and is committed to providing a work environment that is free from harassment and discrimination.
All recruitment, selection procedures and decisions will reflect Ganit’s commitment to providing equal opportunity. All potential candidates will be assessed according to their skills, knowledge, qualifications, and capabilities. No regard will be given to factors such as age, gender, marital status, race, religion, physical impairment, or political opinions.
The thrill of working at a start-up that is starting to scale massively is something else. Simpl (FinTech startup of the year - 2020) was formed in 2015 by Nitya Sharma, an investment banker from Wall Street, and Chaitra Chidanand, a tech executive from the Valley, when they teamed up with a very clear mission: to make money simple so that people can live well and do amazing things. Simpl is the payment platform for the mobile-first world. We’re backed by some of the best names in fintech globally (folks who have invested in Visa, Square and Transferwise), and we have Joe Saunders, ex-Chairman and CEO of Visa, as a board member.
Everyone at Simpl is an internal entrepreneur who is given a lot of bandwidth and resources to create the next breakthrough towards the long-term vision of “making money Simpl”. Our first product is a payment platform that lets people buy instantly, anywhere online, and pay later. In the background, Simpl uses big data for credit underwriting and for risk and fraud modelling, all without any paperwork, and enables Banks and Non-Bank Financial Companies to access a whole new consumer market.
In place of traditional forms of identification and authentication, Simpl integrates deeply into merchant apps via SDKs and APIs. This allows for more sophisticated forms of authentication that take full advantage of smartphone data and processing power.
Skillset:
Workflow manager/scheduler like Airflow, Luigi, Oozie
Good handle on Python
ETL experience
Batch processing frameworks like Spark, MapReduce/Pig
File formats: Parquet, JSON, XML, Thrift, Avro, Protobuf
Rule engine (Drools - business rule management system)
Distributed file systems like HDFS, NFS, AWS S3 and equivalent
Built/configured dashboards
Nice to have:
Data platform experience, e.g. building data lakes, working with near-real-time applications/frameworks like Storm, Flink, Spark
AWS
File encoding types: Thrift, Avro, Protobuf, Parquet, JSON, XML
Hive, HBase
Senior Big Data Engineer
Note: Notice period: 45 days
Banyan Data Services (BDS) is a US-based data-focused Company that specializes in comprehensive data solutions and services, headquartered in San Jose, California, USA.
We are looking for a Senior Hadoop Big Data Engineer who has expertise in solving complex data problems across a big data platform. You will be a part of our development team based out of Bangalore. This team focuses on the most innovative and emerging data infrastructure software and services to support highly scalable and available infrastructure.
It's a once-in-a-lifetime opportunity to join our rocket ship startup run by a world-class executive team. We are looking for candidates that aspire to be a part of the cutting-edge solutions and services we offer that address next-gen data evolution challenges.
Key Qualifications
· 5+ years of experience working with Java and Spring technologies
· At least 3 years of programming experience working with Spark on big data; including experience with data profiling and building transformations
· Knowledge of microservices architecture is a plus
· Experience with any NoSQL databases such as HBase, MongoDB, or Cassandra
· Experience with Kafka or any streaming tools
· Knowledge of Scala would be preferable
· Experience with agile application development
· Exposure to any cloud technologies, including containers and Kubernetes
· Demonstrated experience performing DevOps for platforms
· Strong skills in data structures and algorithms, with the ability to write code of efficient complexity
· Exposure to Graph databases
· Passion for learning new technologies and the ability to do so quickly
· A Bachelor's degree in a computer-related field or equivalent professional experience is required
Key Responsibilities
· Scope and deliver solutions with the ability to design solutions independently based on high-level architecture
· Design and develop the big data-focused micro-Services
· Be involved in big data infrastructure, distributed systems, data modeling, and query processing
· Build software with cutting-edge technologies on cloud
· Be willing to learn new technologies and take on research-oriented projects
· Proven interpersonal skills while contributing to team effort by accomplishing related results as needed
- 6+ years of recent hands-on Java development
- Developing data pipelines in AWS or Google Cloud
- Java, Python, JavaScript programming languages
- Great understanding of designing for performance, scalability, and reliability of data-intensive applications
- Hadoop MapReduce, Spark, Pig. Understanding of database fundamentals and advanced SQL knowledge.
- In-depth understanding of object-oriented programming concepts and design patterns
- Ability to communicate clearly to technical and non-technical audiences, verbally and in writing
- Understanding of full software development life cycle, agile development and continuous integration
- Experience in Agile methodologies including Scrum and Kanban