Role and Responsibilities
- Execute data mining projects, training and deploying models over a typical duration of 2 -12 months.
- The ideal candidate should be able to innovate, analyze the customer requirement, develop a solution in the time box of the project plan, execute and deploy the solution.
- Integrate the data mining projects embedded data mining applications in the FogHorn platform (on Docker or Android).
Core Qualifications
Candidates must meet ALL of the following qualifications:
- Have analyzed, trained and deployed at least three data mining models in the past. If the candidate did not directly deploy their own models, they will have worked with others who have put their models into production. The models should have been validated as robust over at least an initial time period.
- Three years of industry work experience, developing data mining models which were deployed and used.
- Programming experience in Python is core using data mining related libraries like Scikit-Learn. Other relevant Python mining libraries include NumPy, SciPy and Pandas.
- Data mining algorithm experience in at least 3 algorithms across: prediction (statistical regression, neural nets, deep learning, decision trees, SVM, ensembles), clustering (k-means, DBSCAN or other) or Bayesian networks
Bonus Qualifications
Any of the following extra qualifications will make a candidate more competitive:
- Soft Skills
- Sets expectations, develops project plans and meets expectations.
- Experience adapting technical dialogue to the right level for the audience (i.e. executives) or specific jargon for a given vertical market and job function.
- Technical skills
- Commonly, candidates have a MS or Ph.D. in Computer Science, Math, Statistics or an engineering technical discipline. BS candidates with experience are considered.
- Have managed past models in production over their full life cycle until model replacement is needed. Have developed automated model refreshing on newer data. Have developed frameworks for model automation as a prototype for product.
- Training or experience in Deep Learning, such as TensorFlow, Keras, convolutional neural networks (CNN) or Long Short Term Memory (LSTM) neural network architectures. If you don’t have deep learning experience, we will train you on the job.
- Shrinking deep learning models, optimizing to speed up execution time of scoring or inference.
- OpenCV or other image processing tools or libraries
- Cloud computing: Google Cloud, Amazon AWS or Microsoft Azure. We have integration with Google Cloud and are working on other integrations.
- Decision trees like XGBoost or Random Forests is helpful.
- Complex Event Processing (CEP) or other streaming data as a data source for data mining analysis
- Time series algorithms from ARIMA to LSTM to Digital Signal Processing (DSP).
- Bayesian Networks (BN), a.k.a. Bayesian Belief Networks (BBN) or Graphical Belief Networks (GBN)
- Experience with PMML is of interest (see www.DMG.org).
- Vertical experience in Industrial Internet of Things (IoT) applications:
- Energy: Oil and Gas, Wind Turbines
- Manufacturing: Motors, chemical processes, tools, automotive
- Smart Cities: Elevators, cameras on population or cars, power grid
- Transportation: Cars, truck fleets, trains
About FogHorn Systems
FogHorn is a leading developer of “edge intelligence” software for industrial and commercial IoT application solutions. FogHorn’s Lightning software platform brings the power of advanced analytics and machine learning to the on-premise edge environment enabling a new class of applications for advanced monitoring and diagnostics, machine performance optimization, proactive maintenance and operational intelligence use cases. FogHorn’s technology is ideally suited for OEMs, systems integrators and end customers in manufacturing, power and water, oil and gas, renewable energy, mining, transportation, healthcare, retail, as well as Smart Grid, Smart City, Smart Building and connected vehicle applications.
Press: https://www.foghorn.io/press-room/">https://www.foghorn.io/press-room/
Awards: https://www.foghorn.io/awards-and-recognition/">https://www.foghorn.io/awards-and-recognition/
- 2019 Edge Computing Company of the Year – Compass Intelligence
- 2019 Internet of Things 50: 10 Coolest Industrial IoT Companies – CRN
- 2018 IoT Planforms Leadership Award & Edge Computing Excellence – IoT Evolution World Magazine
- 2018 10 Hot IoT Startups to Watch – Network World. (Gartner estimated 20 billion connected things in use worldwide by 2020)
- 2018 Winner in Artificial Intelligence and Machine Learning – Globe Awards
- 2018 Ten Edge Computing Vendors to Watch – ZDNet & 451 Research
- 2018 The 10 Most Innovative AI Solution Providers – Insights Success
- 2018 The AI 100 – CB Insights
- 2017 Cool Vendor in IoT Edge Computing – Gartner
- 2017 20 Most Promising AI Service Providers – CIO Review
Our Series A round was for $15 million. Our Series B round was for $30 million October 2017. Investors include: Saudi Aramco Energy Ventures, Intel Capital, GE, Dell, Bosch, Honeywell and The Hive.
About the Data Science Solutions team
In 2018, our Data Science Solutions team grew from 4 to 9. We are growing again from 11. We work on revenue generating projects for clients, such as predictive maintenance, time to failure, manufacturing defects. About half of our projects have been related to vision recognition or deep learning. We are not only working on consulting projects but developing vertical solution applications that run on our Lightning platform, with embedded data mining.
Our data scientists like our team because:
- We care about “best practices”
- Have a direct impact on the company’s revenue
- Give or receive mentoring as part of the collaborative process
- Questions and challenging the status quo with data is safe
- Intellectual curiosity balanced with humility
- Present papers or projects in our “Thought Leadership” meeting series, to support continuous learning
Similar jobs
Job Title -Data Scientist
Job Duties
- Data Scientist responsibilities includes planning projects and building analytics models.
- You should have a strong problem-solving ability and a knack for statistical analysis.
- If you're also able to align our data products with our business goals, we'd like to meet you. Your ultimate goal will be to help improve our products and business decisions by making the most out of our data.
Responsibilities
Own end-to-end business problems and metrics, build and implement ML solutions using cutting-edge technology.
Create scalable solutions to business problems using statistical techniques, machine learning, and NLP.
Design, experiment and evaluate highly innovative models for predictive learning
Work closely with software engineering teams to drive real-time model experiments, implementations, and new feature creations
Establish scalable, efficient, and automated processes for large-scale data analysis, model development, deployment, experimentation, and evaluation.
Research and implement novel machine learning and statistical approaches.
Requirements
2-5 years of experience in data science.
In-depth understanding of modern machine learning techniques and their mathematical underpinnings.
Demonstrated ability to build PoCs for complex, ambiguous problems and scale them up.
Strong programming skills (Python, Java)
High proficiency in at least one of the following broad areas: machine learning, statistical modelling/inference, information retrieval, data mining, NLP
Experience with SQL and NoSQL databases
Strong organizational and leadership skills
Excellent communication skills
Job Title: Data Engineer
Job Summary: As a Data Engineer, you will be responsible for designing, building, and maintaining the infrastructure and tools necessary for data collection, storage, processing, and analysis. You will work closely with data scientists and analysts to ensure that data is available, accessible, and in a format that can be easily consumed for business insights.
Responsibilities:
- Design, build, and maintain data pipelines to collect, store, and process data from various sources.
- Create and manage data warehousing and data lake solutions.
- Develop and maintain data processing and data integration tools.
- Collaborate with data scientists and analysts to design and implement data models and algorithms for data analysis.
- Optimize and scale existing data infrastructure to ensure it meets the needs of the business.
- Ensure data quality and integrity across all data sources.
- Develop and implement best practices for data governance, security, and privacy.
- Monitor data pipeline performance / Errors and troubleshoot issues as needed.
- Stay up-to-date with emerging data technologies and best practices.
Requirements:
Bachelor's degree in Computer Science, Information Systems, or a related field.
Experience with ETL tools like Matillion,SSIS,Informatica
Experience with SQL and relational databases such as SQL server, MySQL, PostgreSQL, or Oracle.
Experience in writing complex SQL queries
Strong programming skills in languages such as Python, Java, or Scala.
Experience with data modeling, data warehousing, and data integration.
Strong problem-solving skills and ability to work independently.
Excellent communication and collaboration skills.
Familiarity with big data technologies such as Hadoop, Spark, or Kafka.
Familiarity with data warehouse/Data lake technologies like Snowflake or Databricks
Familiarity with cloud computing platforms such as AWS, Azure, or GCP.
Familiarity with Reporting tools
Teamwork/ growth contribution
- Helping the team in taking the Interviews and identifying right candidates
- Adhering to timelines
- Intime status communication and upfront communication of any risks
- Tech, train, share knowledge with peers.
- Good Communication skills
- Proven abilities to take initiative and be innovative
- Analytical mind with a problem-solving aptitude
Good to have :
Master's degree in Computer Science, Information Systems, or a related field.
Experience with NoSQL databases such as MongoDB or Cassandra.
Familiarity with data visualization and business intelligence tools such as Tableau or Power BI.
Knowledge of machine learning and statistical modeling techniques.
If you are passionate about data and want to work with a dynamic team of data scientists and analysts, we encourage you to apply for this position.
Job Summary
As a Data Science Lead, you will manage multiple consulting projects of varying complexity and ensure on-time and on-budget delivery for clients. You will lead a team of data scientists and collaborate across cross-functional groups, while contributing to new business development, supporting strategic business decisions and maintaining & strengthening client base
- Work with team to define business requirements, come up with analytical solution and deliver the solution with specific focus on Big Picture to drive robustness of the solution
- Work with teams of smart collaborators. Be responsible for their appraisals and career development.
- Participate and lead executive presentations with client leadership stakeholders.
- Be part of an inclusive and open environment. A culture where making mistakes and learning from them is part of life
- See how your work contributes to building an organization and be able to drive Org level initiatives that will challenge and grow your capabilities.
Role & Responsibilities
- Serve as expert in Data Science, build framework to develop Production level DS/AI models.
- Apply AI research and ML models to accelerate business innovation and solve impactful business problems for our clients.
- Lead multiple teams across clients ensuring quality and timely outcomes on all projects.
- Lead and manage the onsite-offshore relation, at the same time adding value to the client.
- Partner with business and technical stakeholders to translate challenging business problems into state-of-the-art data science solutions.
- Build a winning team focused on client success. Help team members build lasting career in data science and create a constant learning/development environment.
- Present results, insights, and recommendations to senior management with an emphasis on the business impact.
- Build engaging rapport with client leadership through relevant conversations and genuine business recommendations that impact the growth and profitability of the organization.
- Lead or contribute to org level initiatives to build the Tredence of tomorrow.
Qualification & Experience
- Bachelor's /Master's /PhD degree in a quantitative field (CS, Machine learning, Mathematics, Statistics, Data Science) or equivalent experience.
- 6-10+ years of experience in data science, building hands-on ML models
- Expertise in ML – Regression, Classification, Clustering, Time Series Modeling, Graph Network, Recommender System, Bayesian modeling, Deep learning, Computer Vision, NLP/NLU, Reinforcement learning, Federated Learning, Meta Learning.
- Proficient in some or all of the following techniques: Linear & Logistic Regression, Decision Trees, Random Forests, K-Nearest Neighbors, Support Vector Machines ANOVA , Principal Component Analysis, Gradient Boosted Trees, ANN, CNN, RNN, Transformers.
- Knowledge of programming languages SQL, Python/ R, Spark.
- Expertise in ML frameworks and libraries (TensorFlow, Keras, PyTorch).
- Experience with cloud computing services (AWS, GCP or Azure)
- Expert in Statistical Modelling & Algorithms E.g. Hypothesis testing, Sample size estimation, A/B testing
- Knowledge in Mathematical programming – Linear Programming, Mixed Integer Programming etc , Stochastic Modelling – Markov chains, Monte Carlo, Stochastic Simulation, Queuing Models.
- Experience with Optimization Solvers (Gurobi, Cplex) and Algebraic programming Languages(PulP)
- Knowledge in GPU code optimization, Spark MLlib Optimization.
- Familiarity to deploy and monitor ML models in production, delivering data products to end-users.
- Experience with ML CI/CD pipelines.
Experience Range |
2 Years - 10 Years |
Function | Information Technology |
Desired Skills |
Must Have Skills:
• Good experience in Pyspark - Including Dataframe core functions and Spark SQL
• Good experience in SQL DBs - Be able to write queries including fair complexity.
• Should have excellent experience in Big Data programming for data transformation and aggregations
• Good at ELT architecture. Business rules processing and data extraction from Data Lake into data streams for business consumption.
• Good customer communication.
• Good Analytical skills
|
Education Type | Engineering |
Degree / Diploma | Bachelor of Engineering, Bachelor of Computer Applications, Any Engineering |
Specialization / Subject | Any Specialisation |
Job Type | Full Time |
Job ID | 000018 |
Department | Software Development |
- Experience with relational SQL & NoSQL databases including MySQL & MongoDB.
- Familiar with the basic principles of distributed computing and data modeling.
- Experience with distributed data pipeline frameworks like Celery, Apache Airflow, etc.
- Experience with NLP and NER models is a bonus.
- Experience building reusable code and libraries for future use.
- Experience building REST APIs.
Preference for candidates working in tech product companies
We are an early stage start-up, building new fintech products for small businesses. Founders are IIT-IIM alumni, with prior experience across management consulting, venture capital and fintech startups. We are driven by the vision to empower small business owners with technology and dramatically improve their access to financial services. To start with, we are building a simple, yet powerful solution to address a deep pain point for these owners: cash flow management. Over time, we will also add digital banking and 1-click financing to our suite of offerings.
We have developed an MVP which is being tested in the market. We have closed our seed funding from marquee global investors and are now actively building a world class tech team. We are a young, passionate team with a strong grip on this space and are looking to on-board enthusiastic, entrepreneurial individuals to partner with us in this exciting journey. We offer a high degree of autonomy, a collaborative fast-paced work environment and most importantly, a chance to create unparalleled impact using technology.
Reach out if you want to get in on the ground floor of something which can turbocharge SME banking in India!
Technology stack at Velocity comprises a wide variety of cutting edge technologies like, NodeJS, Ruby on Rails, Reactive Programming,, Kubernetes, AWS, NodeJS, Python, ReactJS, Redux (Saga) Redis, Lambda etc.
Key Responsibilities
-
Responsible for building data and analytical engineering pipelines with standard ELT patterns, implementing data compaction pipelines, data modelling and overseeing overall data quality
-
Work with the Office of the CTO as an active member of our architecture guild
-
Writing pipelines to consume the data from multiple sources
-
Writing a data transformation layer using DBT to transform millions of data into data warehouses.
-
Implement Data warehouse entities with common re-usable data model designs with automation and data quality capabilities
-
Identify downstream implications of data loads/migration (e.g., data quality, regulatory)
What To Bring
-
3+ years of software development experience, a startup experience is a plus.
-
Past experience of working with Airflow and DBT is preferred
-
2+ years of experience working in any backend programming language.
-
Strong first-hand experience with data pipelines and relational databases such as Oracle, Postgres, SQL Server or MySQL
-
Experience with DevOps tools (GitHub, Travis CI, and JIRA) and methodologies (Lean, Agile, Scrum, Test Driven Development)
-
Experienced with the formulation of ideas; building proof-of-concept (POC) and converting them to production-ready projects
-
Experience building and deploying applications on on-premise and AWS or Google Cloud cloud-based infrastructure
-
Basic understanding of Kubernetes & docker is a must.
-
Experience in data processing (ETL, ELT) and/or cloud-based platforms
-
Working proficiency and communication skills in verbal and written English.
1. Expert in deep learning and machine learning techniques,
2. Extremely Good in image/video processing,
3. Have a Good understanding of Linear algebra, Optimization techniques, Statistics and pattern recognition.
Then u r the right fit for this position.
• Total of 4+ years of experience in development, architecting/designing and implementing Software solutions for enterprises.
• Must have strong programming experience in either Python or Java/J2EE.
• Minimum of 4+ year’s experience working with various Cloud platforms preferably Google Cloud Platform.
• Experience in Architecting and Designing solutions leveraging Google Cloud products such as Cloud BigQuery, Cloud DataFlow, Cloud Pub/Sub, Cloud BigTable and Tensorflow will be highly preferred.
• Presentation skills with a high degree of comfort speaking with management and developers
• The ability to work in a fast-paced, work environment
• Excellent communication, listening, and influencing skills
RESPONSIBILITIES:
• Lead teams to implement and deliver software solutions for Enterprises by understanding their requirements.
• Communicate efficiently and document the Architectural/Design decisions to customer stakeholders/subject matter experts.
• Opportunity to learn new products quickly and rapidly comprehend new technical areas – technical/functional and apply detailed and critical thinking to customer solutions.
• Implementing and optimizing cloud solutions for customers.
• Migration of Workloads from on-prem/other public clouds to Google Cloud Platform.
• Provide solutions to team members for complex scenarios.
• Promote good design and programming practices with various teams and subject matter experts.
• Ability to work on any product on the Google cloud platform.
• Must be hands-on and be able to write code as required.
• Ability to lead junior engineers and conduct code reviews
QUALIFICATION:
• Minimum B.Tech/B.E Engineering graduate
Your mission is to help lead team towards creating solutions that improve the way our business is run. Your knowledge of design, development, coding, testing and application programming will help your team raise their game, meeting your standards, as well as satisfying both business and functional requirements. Your expertise in various technology domains will be counted on to set strategic direction and solve complex and mission critical problems, internally and externally. Your quest to embracing leading-edge technologies and methodologies inspires your team to follow suit.
Responsibilities and Duties :
- As a Data Engineer you will be responsible for the development of data pipelines for numerous applications handling all kinds of data like structured, semi-structured &
unstructured. Having big data knowledge specially in Spark & Hive is highly preferred.
- Work in team and provide proactive technical oversight, advice development teams fostering re-use, design for scale, stability, and operational efficiency of data/analytical solutions
Education level :
- Bachelor's degree in Computer Science or equivalent
Experience :
- Minimum 5+ years relevant experience working on production grade projects experience in hands on, end to end software development
- Expertise in application, data and infrastructure architecture disciplines
- Expert designing data integrations using ETL and other data integration patterns
- Advanced knowledge of architecture, design and business processes
Proficiency in :
- Modern programming languages like Java, Python, Scala
- Big Data technologies Hadoop, Spark, HIVE, Kafka
- Writing decently optimized SQL queries
- Orchestration and deployment tools like Airflow & Jenkins for CI/CD (Optional)
- Responsible for design and development of integration solutions with Hadoop/HDFS, Real-Time Systems, Data Warehouses, and Analytics solutions
- Knowledge of system development lifecycle methodologies, such as waterfall and AGILE.
- An understanding of data architecture and modeling practices and concepts including entity-relationship diagrams, normalization, abstraction, denormalization, dimensional
modeling, and Meta data modeling practices.
- Experience generating physical data models and the associated DDL from logical data models.
- Experience developing data models for operational, transactional, and operational reporting, including the development of or interfacing with data analysis, data mapping,
and data rationalization artifacts.
- Experience enforcing data modeling standards and procedures.
- Knowledge of web technologies, application programming languages, OLTP/OLAP technologies, data strategy disciplines, relational databases, data warehouse development and Big Data solutions.
- Ability to work collaboratively in teams and develop meaningful relationships to achieve common goals
Skills :
Must Know :
- Core big-data concepts
- Spark - PySpark/Scala
- Data integration tool like Pentaho, Nifi, SSIS, etc (at least 1)
- Handling of various file formats
- Cloud platform - AWS/Azure/GCP
- Orchestration tool - Airflow