GE Digital connects streams of machine data to powerful analytics & people. Find out about our people & how we're bringing the Industrial Internet to life.
Responsibilities Exp 3~5 years Build up a strong and scalable crawler system for leveraging external user & content data source from Facebook, Youtube and others internet products or service. Getting top trending keywords & topic from social media. Design and build initial version of the real-time analytics product from Machine Learning Models to recommend video contents in real time to 10M+ User Profiles independently. Architect and build Big Data infrastructures using Java, Kafka, Storm, Hadoop, Spark and other related frameworks, experience with Elastic search is a plus Excellent Analytical, Research and Problem Solving skills, in-depth knowledge of Data Structure Desired Skills and Experience B.S./M.S. degree in computer science, mathematics, statistics or a similar quantitative field with good college background 3+ years of work experience in relevant field (Data Engineer, R&D engineer, etc) Experience in Machine Learning and Prediction & Recommendation techniques Experience with Hadoop/MapReduce/Elastic-Stack/ELK and Big Data querying tools, such as Pig, Hive, and Impala Proficiency in a major programming language (e.g. Java/C/Scala) and/or a scripting language (Python) Experience with one or more NoSQL databases, such as MongoDB, Cassandra, HBase, Hive, Vertica, Elastic Search Experience with cloud solutions/AWS, strong knowledge in Linux and Apache Experience with any map-reduce SPARK/EMR Experience in building reports and/or data visualization Strong communication skills and ability to discuss the product with PMs and business owners
Looking for a technically sound and excellent trainer on big data technologies. Get an opportunity to become popular in the industry and get visibility. Host regular sessions on Big data related technologies and get paid to learn.
PlumSlice Labs is an enterprise technology solutions and services company that specializes in innovative enterprise software-as-service products in support of collaborative global commerce. Our solutions are developed using Agile Software Engineering Practices and cloud computing to better enable organizations seeking to build competitive advantage by adopting industry best practices and accelerated implementation. Our current and in-development SaaS offerings include product lifecycle and information management, global commerce and supply chain management. We also provide companies with unmatched IT services, including IT strategy, Application Services, Ecommerce Technology, Supply Chain Systems and Business Intelligence. PlumSlice Labs is headquartered at San Francisco, California. The company was founded by a team of talented individuals whose experience includes a successful track record of working with companies like Best Buy, Visa, IBM, William Sonoma, Manhattan Associates, Restoration Hardware, Starbucks, Infosys, Stubhub.com and more. The single mission is to help companies solve their biggest technology challenges through cost-effective, on demand solutions.
Data Scientist - We are looking for a candidate to build great recommendation engines and power an intelligent m.Paani user journey Responsibilities : - Data Mining using methods like associations, correlations, inferences, clustering, graph analysis etc. - Scale machine learning algorithm that powers our platform to support our growing customer base and increasing data volume - Design and implement machine learning, information extraction, probabilistic matching algorithms and models - Care about designing the full machine learning pipeline. - Extending company's data with 3rd party sources. - Enhancing data collection procedures. - Processing, cleaning and verifying data collected. - Ad hoc analysis of the data and present clear results. - Creating advanced analytics products that provide actionable insights. The Individual : - We are looking for a candidate with the following skills, experience and attributes: Required : - Someone with 2+ years of work experience in machine learning. - Educational qualification relevant to the role. Degree in Statistics, certificate courses in Big Data, Machine Learning etc. - Knowledge of Machine Learning techniques and algorithms. - Knowledge in languages and toolkits like Python, R, Numpy. - Knowledge of data visualization tools like D3,js, ggplot2. - Knowledge of query languages like SQL, Hive, Pig . - Familiar with Big Data architecture and tools like Hadoop, Spark, Map Reduce. - Familiar with NoSQL databases like MongoDB, Cassandra, HBase. - Good applied statistics skills like distributions, statistical testing, regression etc. Compensation & Logistics : This is a full-time opportunity. Compensation will be in line with startup, and will be based on qualifications and experience. The position is based in Mumbai, India, and the candidate must live in Mumbai or be willing to relocate.
candidate will be responsible for all aspects of data acquisition, data transformation, and analytics scheduling and operationalization to drive high-visibility, cross-division outcomes. Expected deliverables will include the development of Big Data ELT jobs using a mix of technologies, stitching together complex and seemingly unrelated data sets for mass consumption, and automating and scaling analytics into the GRAND's Data Lake. Key Responsibilities : - Create a GRAND Data Lake and Warehouse which pools all the data from different regions and stores of GRAND in GCC - Ensure Source Data Quality Measurement, enrichment and reporting of Data Quality - Manage All ETL and Data Model Update Routines - Integrate new data sources into DWH - Manage DWH Cloud (AWS/AZURE/Google) and Infrastructure Skills Needed : - Very strong in SQL. Demonstrated experience with RDBMS, Unix Shell scripting preferred (e.g., SQL, Postgres, Mongo DB etc) - Experience with UNIX and comfortable working with the shell (bash or KRON preferred) - Good understanding of Data warehousing concepts. Big data systems : Hadoop, NoSQL, HBase, HDFS, MapReduce - Aligning with the systems engineering team to propose and deploy new hardware and software environments required for Hadoop and to expand existing environments. - Working with data delivery teams to set up new Hadoop users. This job includes setting up Linux users, setting up and testing HDFS, Hive, Pig and MapReduce access for the new users. - Cluster maintenance as well as creation and removal of nodes using tools like Ganglia, Nagios, Cloudera Manager Enterprise, and other tools. - Performance tuning of Hadoop clusters and Hadoop MapReduce routines. - Screen Hadoop cluster job performances and capacity planning - Monitor Hadoop cluster connectivity and security - File system management and monitoring. - HDFS support and maintenance. - Collaborating with application teams to install operating system and - Hadoop updates, patches, version upgrades when required. - Defines, develops, documents and maintains Hive based ETL mappings and scripts
It is one of the largest communication technology companies in the world. They operate America's largest 4G LTE wireless network and the nation's premiere all-fiber broadband network.
Position:-Data Scientist Location :- Gurgaon Job description Shopclues is looking for talented Data Scientist passionate about building large scale data processing systems to help manage the ever-growing information needs of our clients. Education : PhD/MS or equivalent in Applied mathematics, statistics, physics, computer science or operations research background. 2+ years experience in a relevant role. Skills · Passion for understanding business problems and trying to address them by leveraging data - characterized by high-volume, high dimensionality from multiple sources · Ability to communicate complex models and analysis in a clear and precise manner · Experience with building predictive statistical, behavioural or other models via supervised and unsupervised machine learning, statistical analysis, and other predictive modeling techniques. · Experience using R, SAS, Matlab or equivalent statistical/data analysis tools. Ability to transfer that knowledge to different tools · Experience with matrices, distributions and probability · Familiarity with at least one scripting language - Python/Ruby · Proficiency with relational databases and SQL Responsibilities · Has worked in a big data environment before alongside a big data engineering team (and data visualization team, data and business analysts) · Translate client's business requirements into a set of analytical models · Perform data analysis (with a representative sample data slice) and build/prototype the model(s) · Provide inputs to the data ingestion/engineering team on input data required by the model, size, format, associations, cleansing required
Securonix is a security analytics product company. Our product provides real-time behavior analytics capabilities and uses the following Hadoop components - Kafka, Spark, Impala, HBase. We support very large customers for all our customers globally, with full access to the cluster. Cloudera Certification is a big plus.
Securonix is a Big Data Security Analytics product company. The only product which delivers real-time behavior analytics (UEBA) on Big Data.