Job Description - Sr Azure Data Engineer
Roles & Responsibilities:
- Hands-on programming in C# / .NET.
- Develop serverless applications using Azure Function Apps.
- Writing complex SQL Queries, Stored procedures, and Views.
- Creating Data processing pipeline(s).
- Develop / Manage large-scale Data Warehousing and Data processing solutions.
- Provide clean, usable data and recommend improvements to data efficiency, quality, and integrity.
- Should have working experience with C# / .NET.
- Proficient with writing SQL queries, Stored Procedures, and Views
- Should have worked on Azure Cloud Stack.
- Should have working experience in developing serverless code.
- Must have worked on Azure Data Factory (mandatory).
- 4+ years of relevant experience
About CoStrategix Technologies
- Have 2 to 6 years of experience working in a similar role in a startup environment
- SQL and Excel have no secrets for you
- You love visualizing data with Tableau
- Any experience with product analytics tools (Mixpanel, CleverTap) is a plus
- You solve math puzzles for fun
- A strong analytical mindset with a problem-solving attitude
- Comfortable with being critical and speaking your mind
- You can easily switch between coding (R or Python) and having a business conversation
- Be a team player who thrives in a fast-paced and constantly changing environment
- Interpret data, analyze results using statistical techniques and provide ongoing reports
- Develop and implement databases, data collection systems, data analytics, and other strategies that optimize statistical efficiency and quality
- Acquire data from primary or secondary data sources and maintain databases/data systems
- Identify, analyze, and interpret trends or patterns in complex data sets
- Filter and clean data by reviewing computer reports, printouts, and performance indicators to locate and correct code problems
- Work with the teams to prioritize business and information needs
- Locate and define new process improvement opportunities
- Minimum 1 year of working experience as a Data Analyst or Business Data Analyst
- Technical expertise with data models, database design development, data mining, and segmentation techniques
- Knowledge of statistics and experience using statistical packages for analyzing datasets (Excel, SPSS, SAS, etc)
- Strong analytical skills with the ability to collect, organize, analyze, and disseminate significant amounts of information with attention to detail and accuracy
- Excellent written and verbal communication skills for coordinating across teams.
- A drive to learn and master new technologies and techniques.
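The trend-analysis and reporting duties listed above can be sketched with a short pandas example. This is illustrative only; the column names and figures are hypothetical stand-ins for a real data source:

```python
import pandas as pd

# Hypothetical monthly sales data (stand-in for a real database extract)
df = pd.DataFrame({
    "month": ["Jan", "Jan", "Feb", "Feb", "Mar", "Mar"],
    "region": ["North", "South", "North", "South", "North", "South"],
    "sales": [100, 80, 120, 75, 150, 70],
})

# Aggregate per region to surface the trend (North grows, South shrinks)
trend = df.groupby("region")["sales"].agg(["mean", "sum"])
print(trend)
```

The same groupby-then-aggregate pattern scales from a toy frame like this to millions of rows pulled from a warehouse.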
- Collaborate with the business teams to understand the data environment in the organization; develop and lead the Data Scientists team to test and scale new algorithms through pilots and subsequent scaling up of the solutions
- Influence, build and maintain the large-scale data infrastructure required for the AI projects, and integrate with external IT infrastructure/service
- Act as the single point source for all data related queries; strong understanding of internal and external data sources; provide inputs in deciding data-schemas
- Design, develop and maintain the framework for the analytics solutions pipeline
- Provide inputs to the organization’s initiatives on data quality and help implement frameworks and tools for the various related initiatives
- Work in cross-functional teams of software/machine learning engineers, data scientists, product managers, and others to build the AI ecosystem
- Collaborate with external organizations, including vendors where required, on all data-related queries and implementation initiatives
- Role: Machine Learning Lead
- Experience: 5+ Years
- Employee strength: 80+
- Remuneration: Most competitive in the market
• Advanced knowledge of Python.
• Object Oriented Programming skills.
• Mathematical understanding of machine learning and deep learning algorithms.
• Thorough grasp on statistical terminologies.
• Libraries: TensorFlow, Keras, PyTorch, statsmodels, scikit-learn, SciPy, NumPy, pandas, Matplotlib, Seaborn, Plotly
• Algorithms: Ensemble Algorithms, Artificial Neural Networks and Deep Learning, Clustering Algorithms, Decision Tree Algorithms, Dimensionality Reduction Algorithms, etc.
• MySQL, or NoSQL database implementations such as MongoDB and Elasticsearch.
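A minimal sketch of the ensemble-algorithm and library stack named above, using scikit-learn's random forest on synthetic data. The dataset and hyperparameters are illustrative, not tied to any real project:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic binary-classification data (stand-in for a real dataset)
X, y = make_classification(n_samples=500, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Ensemble algorithm: a random forest of 100 decision trees
clf = RandomForestClassifier(n_estimators=100, random_state=42)
clf.fit(X_train, y_train)
print(f"test accuracy: {clf.score(X_test, y_test):.2f}")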
If interested, kindly share your CV at tanya@tigihr.com
SteelEye is the only regulatory compliance technology and data analytics firm that offers transaction reporting, record keeping, trade reconstruction, best execution and data insight in one comprehensive solution. The firm’s scalable secure data storage platform offers encryption at rest and in flight and best-in-class analytics to help financial firms meet regulatory obligations and gain competitive advantage.
The company has a highly experienced management team and a strong board, who have decades of technology and management experience and worked in senior positions at many leading international financial businesses. We are a young company that shares a commitment to learning, being smart, working hard and being honest in all we do and striving to do that better each day. We value all our colleagues equally and everyone should feel able to speak up, propose an idea, point out a mistake and feel safe, happy and be themselves at work.
Being part of a start-up can be as exciting as it is challenging. You will be part of the SteelEye team not just because of your talent but also because of your entrepreneurial flair, which we thrive on at SteelEye. This means we want you to be curious, contribute, ask questions, and share ideas. We encourage you to get involved in helping shape our business.
What you will do
- Deliver plugins for our Python-based ETL pipelines.
- Deliver Python services for provisioning and managing cloud infrastructure.
- Design, develop, unit-test, and support code in production.
- Deal with challenges associated with large volumes of data.
- Manage expectations with internal stakeholders and context switch between multiple deliverables as priorities change.
- Thrive in an environment that uses AWS and Elasticsearch extensively.
- Keep abreast of technology and contribute to the evolution of the product.
- Champion best practices and provide mentorship.
What we're looking for
- Python 3.
- Python libraries used for data (such as pandas, NumPy).
- Performance tuning.
- Object Oriented Design and Modelling.
- Delivering complex software, ideally in a FinTech setting.
- CI/CD tools.
- Knowledge of design patterns.
- Sharp analytical and problem-solving skills.
- Strong sense of ownership.
- Demonstrable desire to learn and grow.
- Excellent written and oral communication skills.
- Mature collaboration and mentoring abilities.
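The "plugins for our Python-based ETL pipelines" deliverable above can be sketched as a small plugin registry. The registry, transform names, and record shapes here are hypothetical illustrations, not SteelEye's actual API:

```python
from typing import Callable, Dict, List

# Hypothetical plugin registry: each plugin is a transform over records
PLUGINS: Dict[str, Callable[[List[dict]], List[dict]]] = {}

def register(name: str):
    """Decorator that adds a transform function to the registry."""
    def wrap(fn):
        PLUGINS[name] = fn
        return fn
    return wrap

@register("drop_nulls")
def drop_nulls(records):
    # Keep only records where every field has a value
    return [r for r in records if all(v is not None for v in r.values())]

def run_pipeline(records, steps):
    """Apply each named plugin in order."""
    for step in steps:
        records = PLUGINS[step](records)
    return records

raw = [{"id": 1, "price": 10.0}, {"id": 2, "price": None}]
clean = run_pipeline(raw, ["drop_nulls"])
print(clean)  # only the complete record survives
```

A registry like this lets new transforms ship as independent plugins without touching the pipeline driver, which is the usual motivation for the pattern.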
What will you get?
- This is an individual contributor role. If you are someone who loves to code, solve complex problems, and build amazing products without worrying about anything else, this is the role for you.
- You will have the chance to learn from the best in the business who have worked across the world and are technology geeks.
- A company that always appreciates ownership and initiative. If you are someone who is full of ideas, this role is for you.
The Cloudera Data Warehouse Hive team is looking for a passionate senior developer to join our growing engineering team. This group targets the biggest enterprises that want to use Cloudera's services in private and public cloud environments. Our product is built on open source technologies like Hive, Impala, Hadoop, Kudu, Spark, and many more, providing unlimited learning opportunities.
A Day in the Life
Over the past 10+ years, Cloudera has experienced tremendous growth, making us the leading contributor to Big Data platforms and ecosystems and a leading provider of enterprise solutions based on Apache Hadoop. You will work with some of the best engineers in the industry, who are tackling challenges that will continue to shape the Big Data revolution. We foster an engaging, supportive, and productive work environment where you can do your best work. The team culture values engineering excellence, technical depth, grassroots innovation, teamwork, and collaboration.
You will manage product development for our CDP components and develop engineering tools and scalable services to enable efficient development, testing, and release operations. You will be immersed in many exciting, cutting-edge technologies and projects, including collaboration with developers, testers, product, field engineers, and our external partners, both software and hardware vendors.
Opportunity:
Cloudera is a leader in the fast-growing big data platforms market. This is a rare chance to make a name for yourself in the industry and in the Open Source world. The candidate will be responsible for Apache Hive and CDW projects. We are looking for a candidate who would like to work on these projects upstream and downstream. If you are curious about the project and code quality, you can check the project and the code at the following link; you can even start the development before you join. This is one of the beauties of the OSS world.
Apache Hive
•Build robust and scalable data infrastructure software
•Design and create services and system architecture for your projects
•Improve code quality through writing unit tests, automation, and code reviews
•Write Java code and/or build services in the Cloudera Data Warehouse.
•Work with a team of engineers who review each other's code/designs and hold each other to an extremely high bar for the quality of code/designs.
•Understand the basics of Kubernetes.
•Build out the production and test infrastructure.
•Develop automation frameworks to reproduce issues and prevent regressions.
•Work closely with other developers providing services to our system.
•Help to analyze and to understand how customers use the product and improve it where necessary.
•Deep familiarity with Java programming language.
•Hands-on experience with distributed systems.
•Knowledge of database concepts, RDBMS internals.
•Knowledge of the Hadoop stack, containers, or Kubernetes is a strong plus.
•Experience working in a distributed team.
•3+ years of experience in software development.
Azure – Data Engineer
- At least 2 years of hands-on experience on an Agile data engineering team, building big data pipelines using Azure in a commercial environment.
- Dealing with senior stakeholders/leadership
- Understanding of Azure data security and encryption best practices. [ADFS/ACLs]
- Databricks – experience writing in and using Databricks, using Python to transform and manipulate data.
- Data Factory – experience using Data Factory in an enterprise solution to build data pipelines; experience calling REST APIs.
- Synapse/data warehouse – experience using Synapse/a data warehouse to present data securely and to build and manage data models.
- Microsoft SQL Server – we'd expect the candidate to have come from a SQL/data background and progressed into Azure.
- Power BI – experience with this is preferred.
- Experience using Git as a source control system
- Understanding of DevOps concepts and application
- Understanding of Azure Cloud costs/management and running platforms efficiently
● Frame ML / AI use cases that can improve the company’s product
● Implement and develop ML / AI / Data driven rule based algorithms as software items
● For example, building a chatbot that replies with an answer from the relevant FAQ, and
reinforcing the system with a feedback loop so that the bot improves
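The FAQ-chatbot example above can be sketched with a toy bag-of-words matcher built from the standard library. The FAQ entries are made up for illustration; a production system would use proper text embeddings rather than raw word counts:

```python
import math
from collections import Counter

# Hypothetical FAQ: question -> canned answer
FAQ = {
    "how do i reset my password": "Use the 'Forgot password' link on the login page.",
    "what are your support hours": "Support is available 9am-6pm on weekdays.",
}

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two bag-of-words vectors
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def answer(question: str) -> str:
    # Reply with the answer whose FAQ question is most similar to the input
    q = Counter(question.lower().split())
    best = max(FAQ, key=lambda k: cosine(q, Counter(k.split())))
    return FAQ[best]

print(answer("I need to reset my password"))
```

The feedback loop mentioned above would log which answers users accept and reweight the matcher accordingly; that part is omitted here.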
Must have skills:
● Data extraction and ETL
● Python (numpy, pandas, comfortable with OOP)
● Knowledge of basic Machine Learning / Deep Learning / AI algorithms and the ability to apply them
● Good understanding of SDLC
● Experience deploying an ML / AI model in a mobile / web product
● Soft skills : Strong communication skills & Critical thinking ability
Good to have:
● Full stack development experience
B.Tech. / B.E. degree in Computer Science or equivalent software engineering
Roles & Responsibilities
- Proven experience deploying and tuning open source components into enterprise-ready production tooling. Experience with datacentre (Metal as a Service – MAAS) and cloud deployment technologies (AWS or GCP Architect certificates required)
- Deep understanding of Linux from kernel mechanisms through user space management
- Experience on CI/CD (Continuous Integrations and Deployment) system solutions (Jenkins).
- Use monitoring tools (local and on public cloud platforms) such as Nagios, Prometheus, Sensu, ELK, CloudWatch, Splunk, and New Relic to trigger instant alerts and to build reports and dashboards.
- Work closely with the development and infrastructure teams to analyze and design solutions with four-nines (99.99%) uptime across globally distributed, clustered, production and non-production virtualized infrastructure.
- Wide understanding of IP networking as well as data centre infrastructure
- Expert with software development tools and source code management: understanding and managing issues and code changes, and grouping them into deployment releases in a stable and measurable way to maximize production.
- Must be expert at developing and using Ansible roles and configuring deployment templates with Jinja2.
- Solid understanding of data collection tools like Flume, Filebeat, Metricbeat, JMX Exporter agents.
- Extensive experience operating and tuning the Kafka streaming data platform, specifically as a message queue for big data processing
- Strong understanding of, and hands-on experience with:
- Apache Spark framework, specifically Spark Core and Spark Streaming,
- Orchestration platforms: Mesos and Kubernetes,
- Data storage platforms: Elastic Stack, Carbon, ClickHouse, Cassandra, Ceph, HDFS
- Core presentation technologies: Kibana and Grafana.
- Excellent scripting and programming skills (Bash, Python, Java, Go, Rust). Must have previous experience with Rust in order to support and improve in-house developed products
- Red Hat Certified Architect certificate or equivalent required
- CCNA certificate required
- 3-5 years of experience running open source big data platforms
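The Ansible/Jinja2 templating requirement above can be illustrated with a minimal template render. The broker settings and variable names are hypothetical, not a real deployment configuration:

```python
from jinja2 import Template

# Hypothetical Kafka broker config template, as might appear in an Ansible role
template = Template(
    "broker.id={{ broker_id }}\n"
    "listeners=PLAINTEXT://{{ host }}:{{ port }}\n"
    "log.retention.hours={{ retention_hours | default(168) }}\n"
)

# Render with per-host variables; retention falls back to the default filter
rendered = template.render(broker_id=1, host="kafka-01.internal", port=9092)
print(rendered)
```

In an actual Ansible role the same template would live in `templates/` and be rendered per host by the `template` module rather than directly in Python.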