• Strong experience working with Big Data technologies like Spark (Scala/Java),
Apache Solr, Hive, HBase, Elasticsearch, MongoDB, Airflow, Oozie, etc.
• Experience working with relational databases like MySQL, SQL Server, Oracle, etc.
• Good understanding of large system architecture and design
• Experience working in AWS/Azure cloud environment is a plus
• Experience using version control tools and code repositories such as Git/Bitbucket
• Experience using tools like Maven/Jenkins, JIRA
• Experience working in an Agile software delivery environment, with exposure to
continuous integration and continuous delivery tools
• Passionate about technology and delivering solutions to solve complex business
problems
• Great collaboration and interpersonal skills
• Ability to work with team members and lead by example in code, feature
development, and knowledge sharing
LogiNext is looking for a technically savvy and passionate Senior Software Engineer - Data Science to analyze large amounts of raw information to find patterns that will help improve our company. We will rely on you to build data products to extract valuable business insights.
In this role, you should be highly analytical with a knack for analysis, math and statistics. Critical thinking and problem-solving skills are essential for interpreting data. We also want to see a passion for machine-learning and research.
Your goal will be to help our company analyze trends to make better decisions. Without knowledge of how the software works, data scientists may struggle in their work. Apart from experience developing in R and Python, they must know modern approaches to software development and their impact. DevOps practices such as continuous integration and deployment, along with experience in cloud computing, are everyday skills for managing and processing data.
Responsibilities:
- Adapt and enhance machine learning techniques based on physical intuition about the domain
- Design sampling methodology; prepare data, including data cleaning, univariate analysis, and missing value imputation; identify appropriate analytic and statistical methodology; develop predictive models; and document process and results
- Lead projects both as a principal investigator and project manager, responsible for meeting project requirements on schedule and on budget
- Coordinate and lead efforts to innovate by deriving insights from heterogeneous sets of data generated by our suite of Aerospace products
- Support and mentor data scientists
- Maintain and work with our data pipeline that transfers and processes several terabytes of data using Spark, Scala, Python, Apache Kafka, Pig/Hive & Impala
- Work directly with application teams/partners (internal clients such as Xbox, Skype, Office) to understand their offerings/domain and help them become successful with data so they can run controlled experiments (A/B testing)
- Understand the data generated by experiments, and produce actionable, trustworthy conclusions from them
- Apply data analysis, data mining and data processing to present data clearly and develop experiments (A/B testing)
- Work with the development team to build tools for data logging and repeatable data tasks to accelerate and automate data scientist duties
Requirements:
- Bachelor’s or Master’s degree in Computer Science, Math, Physics, Engineering, Statistics or other technical field; PhD preferred
- 4 to 7 years of experience in data mining, data modeling, and reporting
- 3+ years of experience working with large data sets or doing large-scale quantitative analysis
- Expert SQL scripting required
- Development experience in one of the following: Scala, Java, Python, Perl, PHP, C++ or C#
- Experience working with Hadoop, Pig/Hive, Spark, MapReduce
- Ability to drive projects
- Basic understanding of statistics: hypothesis testing, p-values, confidence intervals, regression, classification, and optimization are core lingo
- Analysis: should be able to perform Exploratory Data Analysis and get actionable insights from the data, with impressive visualization
- Modeling: should be familiar with ML concepts and algorithms; understanding of the internals and pros/cons of models is required
- Strong algorithmic problem-solving skills
- Experience manipulating large data sets through statistical software (e.g. R, SAS) or other methods
- Superior verbal, visual and written communication skills to educate and work with cross-functional teams on controlled experiments
- Experimentation design or A/B testing experience is preferred
- Experience in team management
LogiNext is looking for a technically savvy and passionate Junior Software Engineer - Data Science to analyze large amounts of raw information to find patterns that will help improve our company. We will rely on you to build data products to extract valuable business insights.
In this role, you should be highly analytical with a knack for analysis, math and statistics. Critical thinking and problem-solving skills are essential for interpreting data. We also want to see a passion for machine-learning and research.
Your goal will be to help our company analyze trends to make better decisions. Without knowledge of how the software works, data scientists may struggle in their work. Apart from experience developing in R and Python, they must know modern approaches to software development and their impact. DevOps practices such as continuous integration and deployment, along with experience in cloud computing, are everyday skills for managing and processing data.
Responsibilities:
- Identify valuable data sources and automate collection processes
- Undertake preprocessing of structured and unstructured data
- Analyze large amounts of information to discover trends and patterns
- Build predictive models and machine-learning algorithms
- Combine models through ensemble modeling
- Present information using data visualization techniques
- Propose solutions and strategies to business challenges
- Collaborate with engineering and product development teams
Requirements:
- Bachelor’s degree or higher in Computer Science, Information Technology, Information Systems, Statistics, Mathematics, Commerce, Engineering, Business Management, Marketing or related field from a top-tier school
- 0 to 1 year of experience in data mining, data modeling, and reporting
- Understanding of SaaS-based products and services
- Understanding of machine learning and operations research
- Knowledge of R, SQL and Python; familiarity with Scala, Java or C++ is an asset
- Knowledge of business intelligence tools (e.g. Tableau) and data frameworks (e.g. Hadoop)
- Analytical mind, business acumen and problem-solving aptitude
- Excellent communication and presentation skills
- Proficiency in Excel for data management and manipulation
- Experience in statistical modeling techniques and data wrangling
- Able to work independently and set goals keeping business objectives in mind
Job Description
Mandatory Requirements
- Experience in AWS Glue
- Experience in Apache Parquet
- Proficient in AWS S3 and data lakes
- Knowledge of Snowflake
- Understanding of file-based ingestion best practices
- Scripting languages: Python and PySpark
CORE RESPONSIBILITIES
- Create and manage cloud resources in AWS
- Ingest data from different data sources that expose data using different technologies, such as RDBMS, flat files, streams, and time series data based on various proprietary systems. Implement data ingestion and processing with the help of Big Data technologies
- Process and transform data using various technologies such as Spark and cloud services. You will need to understand your part of the business logic and implement it using the language supported by the base data platform
- Develop automated data quality checks to make sure the right data enters the platform and to verify the results of the calculations
- Develop an infrastructure to collect, transform, combine and publish/distribute customer data
- Define process improvement opportunities to optimize data collection, insights and displays
- Ensure data and results are accessible, scalable, efficient, accurate, complete and flexible
- Identify and interpret trends and patterns from complex data sets
- Construct a framework utilizing data visualization tools and techniques to present consolidated analytical and actionable results to relevant stakeholders
- Act as a key participant in regular Scrum ceremonies with the agile teams
- Be proficient at developing queries, writing reports and presenting findings
- Mentor junior members and bring in industry best practices
QUALIFICATIONS
- 5-7+ years’ experience as a data engineer in consumer finance or an equivalent industry (consumer loans, collections, servicing, optional product, and insurance sales)
- Strong background in math, statistics, computer science, data science or a related discipline
- Advanced knowledge of one of the following languages: Java, Scala, Python, C#
- Production experience with: HDFS, YARN, Hive, Spark, Kafka, Oozie / Airflow, Amazon Web Services (AWS), Docker / Kubernetes, Snowflake
- Proficient with:
  - Data mining/programming tools (e.g. SAS, SQL, R, Python)
  - Database technologies (e.g. PostgreSQL, Redshift, Snowflake, and Greenplum)
  - Data visualization (e.g. Tableau, Looker, MicroStrategy)
- Comfortable learning about and deploying new technologies and tools
- Organizational skills and the ability to handle multiple projects and priorities simultaneously and meet established deadlines
- Good written and oral communication skills and ability to present results to non-technical audiences
- Knowledge of business intelligence and analytical tools, technologies and techniques
Familiarity and experience in the following is a plus:
- AWS certification
- Spark Streaming
- Kafka Streams / Kafka Connect
- ELK Stack
- Cassandra / MongoDB
- CI/CD: Jenkins, GitLab, Jira, Confluence, and other related tools
- Play a critical role as a member of the leadership team in shaping and supporting our overall company vision, day-to-day operations, and culture.
- Set the technical vision and build the technical product roadmap from launch to scale; including defining long-term goals and strategies
- Define best practices around coding methodologies, software development, and quality assurance
- Define innovative technical requirements and systems while balancing time, feasibility, cost and customer experience
- Build and support production products
- Ensure our internal processes and services comply with privacy and security regulations
- Establish a high performing, inclusive engineering culture focused on innovation, execution, growth and development
- Set a high bar for our overall engineering practices in support of our mission and goals
- Develop goals, roadmaps and delivery dates to help us scale quickly and sustainably
- Collaborate closely with Product, Business, Marketing and Data Science
- Experience with financial and transactional systems
- Experience engineering for large volumes of data at scale
- Experience with financial audit and compliance is a plus
- Experience building successful consumer-facing web and mobile apps at scale
1. Communicate with the clients and understand their business requirements.
2. Build, train, and manage your own team of junior data engineers.
3. Assemble large, complex data sets that meet the client’s business requirements.
4. Identify, design and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
5. Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources, including the cloud.
6. Assist clients with data-related technical issues and support their data infrastructure requirements.
7. Work with data scientists and analytics experts to strive for greater functionality.
Skills required (experience with most of these):
1. Experience with Big Data tools: Hadoop, Spark, Apache Beam, Kafka, etc.
2. Experience with object-oriented/object function scripting languages: Python, Java, C++, Scala, etc.
3. Experience in ETL and Data Warehousing.
4. Experience and firm understanding of relational and non-relational databases like MySQL, MS SQL Server, Postgres, MongoDB, Cassandra etc.
5. Experience with cloud platforms like AWS, GCP and Azure.
6. Experience with workflow management using tools like Apache Airflow.
Location: Chennai - Guindy Industrial Estate
Duration: Full time role
Company: Mobile Programming (https://www.mobileprogramming.com/)
Client Name: Samsung
We are looking for a Data Engineer to join our growing team of analytics experts. The hire will be
responsible for expanding and optimizing our data and data pipeline architecture, as well as optimizing
data flow and collection for cross functional teams. The ideal candidate is an experienced data pipeline
builder and data wrangler who enjoys optimizing data systems and building them from the ground up.
The Data Engineer will support our software developers, database architects, data analysts and data
scientists on data initiatives and will ensure optimal data delivery architecture is consistent throughout
ongoing projects. They must be self-directed and comfortable supporting the data needs of multiple
teams, systems and products.
Responsibilities for Data Engineer
Create and maintain optimal data pipeline architecture.
Assemble large, complex data sets that meet functional / non-functional business requirements.
Identify, design, and implement internal process improvements: automating manual processes,
optimizing data delivery, re-designing infrastructure for greater scalability, etc.
Build the infrastructure required for optimal extraction, transformation, and loading of data
from a wide variety of data sources using SQL and AWS big data technologies.
Build analytics tools that utilize the data pipeline to provide actionable insights into customer
acquisition, operational efficiency and other key business performance metrics.
Work with stakeholders including the Executive, Product, Data and Design teams to assist with
data-related technical issues and support their data infrastructure needs.
Create data tools for analytics and data scientist team members that assist them in building and
optimizing our product into an innovative industry leader.
Work with data and analytics experts to strive for greater functionality in our data systems.
Qualifications for Data Engineer
Experience building and optimizing big data ETL pipelines, architectures and data sets.
Advanced working SQL knowledge and experience working with relational databases, query
authoring (SQL) as well as working familiarity with a variety of databases.
Experience performing root cause analysis on internal and external data and processes to
answer specific business questions and identify opportunities for improvement.
Strong analytic skills related to working with unstructured datasets.
Build processes supporting data transformation, data structures, metadata, dependency and
workload management.
A successful history of manipulating, processing and extracting value from large disconnected
datasets.
Working knowledge of message queuing, stream processing and highly scalable ‘big data’ data
stores.
Strong project management and organizational skills.
Experience supporting and working with cross-functional teams in a dynamic environment.
We are looking for a candidate with 3-6 years of experience in a Data Engineer role, who has
attained a Graduate degree in Computer Science, Statistics, Informatics, Information Systems or another quantitative field. They should also have experience using the following software/tools:
Experience with big data tools: Spark, Kafka, HBase, Hive etc.
Experience with relational SQL and NoSQL databases
Experience with AWS cloud services: EC2, EMR, RDS, Redshift
Experience with stream-processing systems: Storm, Spark-Streaming, etc.
Experience with object-oriented/object function scripting languages: Python, Java, Scala, etc.
Skills: Big Data, AWS, Hive, Spark, Python, SQL