Ocado van

The story of how to quickly check your team’s practices while having a lot of fun.

How do firefighters assess their readiness for an emergency situation? How do they ensure that everyone knows what to do in case of a fire?

The best way to check is to run drills that simulate a real-life emergency.

Firefighters train regularly to check and improve their procedures: they induct new members, test new equipment, and build trust in each other.
Nothing can teach you more than real practice.

Like firefighters, data scientists need to perform as a team. This means introducing ways of working to new joiners, making use of the best tools and techniques they have at their disposal, and knowing each other’s skills and personalities.

 

In this article we would like to describe how we organised an internal Kaggle-like competition to test and assess our data science procedures.

Our small data science competition

Recently, we organised a machine learning competition at Ocado Technology to check how well equipped our data science teams were to solve real problems under pressure. We invited data scientists from our five offices (i.e. Kraków, Wrocław, Sofia, Barcelona, and Hatfield) to our headquarters in the UK. We ordered some pizzas and started a hackday.

We formed teams and adopted only one rule: data scientists who work on the same team in real life could not work together during the hackday. We wanted to encourage people to get to know each other.

We decided to run a Kaggle-style competition. For people who are not familiar with Kaggle, it’s a platform where the organisers define the business problem, the data, and the evaluation metrics; participants then ‘only’ have to build the corresponding machine learning models.

Our problem

The goal of the competition was to predict the total time-at-door for Ocado delivery vans.  

We wanted to know how much time it would take to deliver groceries to a particular address. Ocado uses these predicted delivery times to plan van routes with more certainty and thereby open up more one-hour delivery slots for customers to choose from.
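To give a flavour of the problem, here is a minimal baseline sketch one might start from: predict the historical mean time-at-door per postcode area. The column names are hypothetical, not Ocado’s actual schema.

```python
import pandas as pd

# Toy data; 'postcode_area' and 'time_at_door_sec' are assumed column names.
deliveries = pd.DataFrame({
    "postcode_area": ["AL10", "AL10", "SW1", "SW1", "SW1"],
    "time_at_door_sec": [300, 420, 180, 240, 210],
})

global_mean = deliveries["time_at_door_sec"].mean()
per_area = deliveries.groupby("postcode_area")["time_at_door_sec"].mean()

def predict_time_at_door(postcode_area):
    # Fall back to the global mean for areas we have never delivered to.
    return per_area.get(postcode_area, global_mean)

print(predict_time_at_door("AL10"))  # 360.0
print(predict_time_at_door("EC1"))   # global mean, 270.0
```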

Ocado vans

Similar to Kaggle, we prepared some baselines and a leaderboard where we showed the best solutions; this gave participants additional motivation to build something better than everyone else. At the end of the competition, the teams presented their findings and models. We learned a lot about our data, our practices, and ourselves.

It was a great day so we wrapped things up with the customary pint in the pub.

Five lessons learned after the Kaggle competition at Ocado

You can easily apply these lessons in your data science team or data department:

 

    1. Hackdays are a great chance to socialise

There’s nothing like a competition to get people from different offices working together and therefore getting to know each other; people learn about each other’s strengths and weaknesses. We found problem-solving to be a great team-building exercise, and a post-event survey confirmed that people indeed had a lot of fun.

    2. Machine learning models are only the tip of the iceberg

As an organiser, you need to choose the problem wisely: it cannot be too difficult to solve in one day, but it should still be challenging. You have to define the evaluation metrics, gather the data, split it into training and test sets, write down the rules, and so on. During his presentation at NIPS 2016, Ben Hamner (CTO of Kaggle) confirmed that Kaggle employees invest hundreds of hours behind the scenes to set up each competition properly. In most data science projects, only around 5-10% of the time is spent on modeling.
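To make that setup work concrete, here is a hedged sketch of how organisers might split the data and withhold the test labels for leaderboard scoring; the file and column names are placeholders:

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# 'deliveries.csv' and the 'target' column are placeholder names.
data = pd.read_csv("deliveries.csv")

# Participants get the full training set plus test features only;
# organisers keep the test labels private for scoring submissions.
train, test = train_test_split(data, test_size=0.2, random_state=42)
train.to_csv("train.csv", index=False)
test.drop(columns=["target"]).to_csv("test_features.csv", index=False)
test[["target"]].to_csv("test_labels_private.csv", index=False)
```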

    3. Data science is all about iterations

During the competition, some teams over-complicated their models: they tried to check too many things at the same time and overestimated what was feasible in a single day. At the end of the day, only working models really matter (all teams had plenty of ideas on what they would have liked to check but ran out of time).

It works pretty similarly in real life. We’ve written about this here as well.

Practice and contests like this can show your team the benefit of iterative work.

    4. Domain knowledge can make all the difference

Rather than reaching for a more complicated model, it’s better to first invest energy into understanding the metrics, analysing the data, and checking distributions and outliers. The team that won the competition used their knowledge of Ocado’s business to improve their model. In real life, domain knowledge is very often essential.

    5. Improve your engineering practices

Python and R are two of the most popular programming languages for data scientists.

To work effectively, you need to know your tools very well, including programming languages and frameworks. If you want to rapidly test hypotheses or add new variables, you cannot afford to be blocked by your technology.

This hackday showed us that we need to work harder on unifying our technology stack and improving our induction process, so that everyone can easily get the data, run an analysis or build a model, and share their results with the rest of the team.

 

As we’ve seen, a one day hackday event can provide a very useful health check for your team. You can check how people organise their work, what tools they are using and how they are working to solve problems. But hackdays can be beneficial not only for data science or engineering teams; management teams can use them to decide training budgets, investment in tools and technology, or for forming new teams. We therefore strongly encourage you to involve your managers or team leaders in these events as much as possible.

Try to run similar competitions in your company. We assure you that you will learn much more than you expect while having a lot of fun.

Lukas Innig, Marcin Druzkowski

May 18th, 2017

Posted In: Blog


Cloud computing model

Lessons learned from deploying an NLP project in production for the Ocado contact centre.

 

A few months ago, the data science team at Ocado Technology embarked on a project to categorise and prioritise customer emails coming into the Ocado contact centre. You can read more about the history of our project in this post: Building ML model is hard. Deploying into real business is even harder.

Today we’d like to offer other engineers and scientists some useful tips and lessons we learned along the way after deploying our machine learning (ML) project into production.

Tip 1: Understand the domain of the problem

At the beginning of every machine learning project, you need to sit down with your business stakeholders and understand what they are trying to achieve.

It will be very hard to build a properly working solution without understanding the scope of the problem, and you may even find that an ML model is the wrong solution to the problem they are trying to solve.

Talk with your business colleagues and don’t be afraid to provide open feedback.

Tip 2: Define your success metrics

After you have familiarized yourself with the domain of the problem, discuss how to measure the success of the project. A good idea is to come up with two different sets of metrics. Start with business-related metrics, such as financial gain or improved worker productivity; then think about machine-learning metrics that help you build and validate models properly. Try to understand the relationship between the two.

Tip 3: Prepare for change

When you are building your machine learning project, you need to take into account that the business will change in the future. Priorities will change, problems will change. Everything flows.

Try to build a flexible solution and let the business decide what happens next. Be agile.

Tip 4: Don’t forget about security and legal obligations

All machine learning models use some data under the hood. It’s your responsibility to keep and process this data safely and according to the law, especially if you deal with confidential data like customer addresses and emails.

Tip 5: Enrich your data

Data quality has a huge impact on the final accuracy of your model. Invest money and time to gather high-quality data, and think about how you can enrich your dataset.

In our case, our model used the Wikipedia corpus to learn the English language and initialize the embedding layer.
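As a hedged sketch of what that initialisation can look like: load pretrained word2vec-format vectors (here via gensim; the file name is hypothetical) and copy them into the embedding matrix for your vocabulary.

```python
import numpy as np
from gensim.models import KeyedVectors

# Placeholder file; any word2vec-format vectors trained on Wikipedia would do.
vectors = KeyedVectors.load_word2vec_format("wiki_vectors.bin", binary=True)

vocab = ["refund", "delivery", "driver"]  # toy vocabulary
embedding_matrix = np.zeros((len(vocab), vectors.vector_size))
for i, word in enumerate(vocab):
    if word in vectors:
        embedding_matrix[i] = vectors[word]  # pretrained row
    # words missing from the corpus stay zero (or could be random-initialised)
```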

Tip 6: Create a simple model first

Unfortunately, data scientists tend to overcomplicate their models.

It’s much better to build something simple that works; so always start with the simplest model. Simple models are easier to debug, easier to explain and easier to deploy.

It might be that you will not need a sophisticated model at all.

Tip 7: People do not trust machine learning, people trust other people

To build trust, be honest about your model’s accuracy. Tell others what the limits of your model are and what could be improved in the future. Transparency is the easiest way to build trust.

Tip 8: Treat your ML project like any other software project

Write tests. Make code reviews. Manage technical debt. All software engineering standards should be met. Your machine learning model is just software. No excuses.
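As a minimal illustration of the “write tests” point, here is a pytest-style sketch for a toy preprocessing function (the function is hypothetical, standing in for your own pipeline code):

```python
def tokenize(text):
    # Toy stand-in for a real preprocessing step.
    return text.lower().split()

def test_tokenize_lowercases():
    assert tokenize("Refund PLEASE") == ["refund", "please"]

def test_tokenize_is_deterministic():
    assert tokenize("Hello World") == tokenize("Hello World")
```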

Tip 9: Deployment to production is not the end of the project. It’s the beginning of giving value.

Ensure that you can easily answer the questions below:

1) Who will support and maintain your model?
2) What is the procedure in case of an emergency?
3) Which dashboard or monitoring lets you assess the current quality of the model?

Tip 10: If nine tips are not enough…

Take a look at these 43 rules of Machine Learning Engineering from Google.

Marcin Druzkowski

April 10th, 2017

Posted In: Blog


Contact Centre

From idea to production system – the story of how an NLP project in the Ocado contact centre improved reply times by up to 4x. Also, ten tips for other Data Science teams.

 

A few months ago, we described on our blog how machine learning (ML) improved efficiency in our contact centre. Today we would like to tell you how we built this system, what we have learned along the way, and how we were able to reduce response times for customer emails by up to 4x.

Email dump cartoon

Presenting the problem

Imagine that you are the manager of a sizeable contact center that receives a few thousand customer emails every day. Your customers typically contact you about very different things. For example:

  • John wants to give feedback about how polite his driver was
  • Matthew asks for a refund because his product was damaged
  • Alice informs us that she isn’t at home so the delivery won’t be successful
  • Jane wants to say thank you for her great first delivery

As a manager you need to decide:

1. How long can an email wait in the queue without a response?
2. Is Alice’s request more important than John’s feedback?

There are no easy answers to these questions. All contact center managers need to deal with these problems, and Ocado is no exception.

Offering a solution

Imagine that you have a system that assigns appropriate tags based on the content of an email, as in the example below:

Customer service email quote

Later, another function determines the priority of that email (and how quickly you should react) based on tags returned from the machine learning model. In our contact center, the tag cloud included labels such as Feedback, Food issue, Spam, Damaged item, Voucher, Quality and a few others.

You may wonder why we split this process into two steps rather than classifying priority directly.

This was one of our lessons learned. When you are building machine learning models for a real business, you need to take into account that the business will change and priorities will shift; hard-coding those priorities into your model is always a bad idea. To be agile, you need to give your business a lot of flexibility.

Here is an example of how assigning priorities would work:

Chart assigning email priorities

In our proposed solution, the contact center manager can decide that emails tagged “Thank you” (generally sent by happy customers) are not as important as “Payment issue”-type emails which must be answered in a matter of minutes.
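A hedged sketch of what such a mapping might look like, kept outside the ML model so the business can change it without retraining (the tags and priority levels here are illustrative):

```python
# Illustrative tag-to-priority mapping, owned by the business, not the model.
PRIORITY = {
    "Payment issue": "urgent",   # answer within minutes
    "Damaged item": "high",
    "Feedback": "normal",
    "Thank you": "low",
}
ORDER = ["urgent", "high", "normal", "low"]

def email_priority(tags):
    # An email inherits the most urgent priority among its predicted tags.
    return min((PRIORITY.get(t, "normal") for t in tags), key=ORDER.index)

print(email_priority(["Thank you"]))                  # low
print(email_priority(["Feedback", "Payment issue"]))  # urgent
```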

First things first

Before we started gathering data, we wanted to ensure that we all understood the domain of the problem correctly. Nothing beats hands-on experience, so we switched off our computers and spent a day in the contact center to see what the work there really looks like. That experience was fun and, in hindsight, very useful; it helped us build relationships with many colleagues we hadn’t interacted with before and understand their problems in greater detail.

To determine the success of this project, we defined a clear business goal: to minimize the amount of time which urgent emails need to wait in the queue before receiving a response.

At the end of the project, we wanted to see the following pattern appear on the contact center dashboard.

Graph showing decline in email queues

From a machine learning perspective, this is a classic multilabel text classification problem. In multilabel problems, evaluating solutions quickly often means computing a single aggregate measure that combines the measures for individual labels. We decided to use the well-known F1 score, apply it to every label, and average the results (an approach known as macro averaging).
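In code, macro averaging is a one-liner with scikit-learn; a toy sketch with three labels:

```python
import numpy as np
from sklearn.metrics import f1_score

# Rows are emails, columns are tags (1 = tag applies).
y_true = np.array([[1, 0, 1], [0, 1, 1], [1, 0, 0]])
y_pred = np.array([[1, 0, 0], [0, 1, 1], [0, 0, 0]])

# Macro averaging: F1 per label, then the unweighted mean,
# so rare tags count just as much as common ones.
print(f1_score(y_true, y_pred, average="macro"))  # ~0.78 here
```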

The dataset

Ocado maintains a large dataset of inbound emails that have been manually categorised by our contact centre advisors over the course of several years; this gave us over one million training examples for our multilabel classification. We couldn’t use the data in its raw format, however: some emails contained confidential data like phone numbers, postal or email addresses, and customer names. Before we did anything with the data, we had to anonymise it. Deleting personal information is a very complex task and could be the topic of a standalone blog post.
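To give an idea of the shape of the problem (and only that), here is a deliberately simplified sketch; real anonymisation needs much more than a couple of regular expressions (names, addresses, context-dependent identifiers, and so on):

```python
import re

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
PHONE = re.compile(r"\b(?:\+44|0)\d[\d \-]{8,12}\d\b")

def scrub(text):
    # Replace obvious personal identifiers with placeholder tokens.
    text = EMAIL.sub("<EMAIL>", text)
    text = PHONE.sub("<PHONE>", text)
    return text

print(scrub("Call me on 01707 227800 or mail jane.doe@example.com"))
```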

Building the machine learning model

Before building any machine learning model, it’s always worth creating a simple heuristic baseline to benchmark against. For our particular problem, with a set of 19 sparsely distributed tags, always choosing the most common label (or predicting at random) gives an F1 score of around 0.05.

We started the modeling phase with a logistic regression model on a Bag of Words representation of the data. This simple solution achieved an F1 score close to 0.35 and helped us ensure that all parts of the system worked properly, so that we could later focus purely on improving the accuracy of the model. A neural network was an obvious choice for this. We evaluated two different architectures: convolutional neural networks (CNNs) and recurrent neural networks (RNNs). We found recurrent architectures such as GRUs and LSTMs harder to train and very close to CNNs in performance (but not better). Although a bit surprising, this is probably a reflection of the simplicity of our problem: each tag is usually associated with the presence or absence of particular phrases, so we don’t especially need to learn the long-term dependencies that LSTMs capture.
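A comparable baseline can be put together in a few lines of scikit-learn; a hedged sketch on toy data (the real model was trained on the anonymised email corpus):

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline

# Toy stand-in for the anonymised email corpus.
emails = ["my item arrived damaged", "thank you for the lovely delivery",
          "I was charged twice", "the driver was very polite"]
labels = np.array([[1, 0], [0, 1], [1, 0], [0, 1]])  # columns: [issue, praise]

# Bag of Words features + one logistic regression per tag (one-vs-rest).
model = make_pipeline(CountVectorizer(),
                      OneVsRestClassifier(LogisticRegression()))
model.fit(emails, labels)
print(model.predict(["the delivery was damaged"]))
```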

Below you can find the structure of our neural network which consists of a word embedding layer, two parallel convolutional layers, and a max pooling over the entire text followed by two fully connected layers; for each layer we applied batch normalization. In order to speed up the training we used word2vec embeddings as an initialization to our word embedding layer.

Structure of neural network

The whole architecture is surprisingly shallow. We trained it with a sigmoid cross-entropy loss for around 20 epochs over our dataset, and it achieves a macro F1 of around 0.8 in production.
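Our model was written directly in TensorFlow; as an approximation, here is a minimal Keras sketch of a comparable architecture (layer sizes and vocabulary numbers are illustrative, not the production values):

```python
import tensorflow as tf
from tensorflow.keras import layers

VOCAB, DIM, MAXLEN, TAGS = 50000, 300, 500, 19  # illustrative sizes

inp = layers.Input(shape=(MAXLEN,))
emb = layers.Embedding(VOCAB, DIM)(inp)  # word2vec-initialised in practice

# Two parallel convolutions with different window widths,
# each max-pooled over the entire text.
branches = []
for width in (3, 5):
    c = layers.Conv1D(128, width, activation="relu")(emb)
    c = layers.BatchNormalization()(c)
    branches.append(layers.GlobalMaxPooling1D()(c))

x = layers.Concatenate()(branches)
x = layers.Dense(256, activation="relu")(x)
x = layers.BatchNormalization()(x)
out = layers.Dense(TAGS, activation="sigmoid")(x)  # one probability per tag

model = tf.keras.Model(inp, out)
# binary_crossentropy on sigmoid outputs == a sigmoid cross-entropy loss
model.compile(optimizer="adam", loss="binary_crossentropy")
```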

You can read more about text classification from the following list of useful papers:

Deploying the model into production

Many recent papers, articles, and blog posts on machine learning focus only on improving the accuracy of a model. It’s worth emphasizing that modeling is only one of many steps in a data science project, and the other steps are equally important for the project to be successful.

A model which does not work in production is worth nothing.

From the first day you embark on a data science project, you should think about how you will expose your model – the sooner, the better. There are many reasons why a project can fail during deployment into a production environment.

We found three top reasons why this might happen:

  • Using the wrong technologies
  • Forgetting about software engineering practices
  • The lack of monitoring and support

Using the wrong technologies

To make sure that the incorrect use of technology will not block your deployment, you need to choose your platforms and tools wisely. It’s worth using technology that can be moved easily between environments and modes (i.e. the code remains the same during training, prediction, and serving).

We decided to build our models in TensorFlow and deploy them on Google Cloud Machine Learning. TensorFlow allows you to specify the architecture in a high-level Python API and run those models on distributed computing systems, including GPUs. Google Cloud Machine Learning provides managed services that let you easily expose your ML model as a REST API.
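Calling such a deployed model looked roughly like the sketch below, using the Cloud ML Engine v1 API of the time; the project, model name, and input format here are hypothetical:

```python
from googleapiclient import discovery

service = discovery.build("ml", "v1")
name = "projects/my-project/models/email_tagger"  # placeholder names

response = service.projects().predict(
    name=name,
    body={"instances": [{"tokens": [12, 87, 4, 0]}]},  # assumed input schema
).execute()
print(response["predictions"])
```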

TensorFlow logo

Forgetting about software engineering practices

When you focus on building the best machine learning model, it’s very easy to forget that you write normal code. There is no magic to this: software engineering best practices will help make your code easier to maintain. For a software engineer’s perspective on data science, please have a look at this presentation.

Monitoring and support

At Ocado, we believe that teams work better when they are self-sufficient (as they don’t need to wait for other teams). Thanks to technologies like TensorFlow and Google Cloud Machine Learning, data scientists can also write and support production code. We feel we have ownership of the whole solution, i.e. the data product, machine learning model, dashboards, alerting policies, etc.

A screenshot from the production dashboard built with Google Data Studio

Reaping the benefits

Thanks to this project, we were able to significantly boost the efficiency of the contact centre. For example, we found that 7% of all inbound messages did not require a reply; this meant that our customer service advisors could spend more time on higher-priority tasks.

Because the machine learning model automatically categorises emails, we have access to information quicker than ever before and can react much faster to sudden spikes in customer issues.

The project has also had an impact on the overall customer experience: urgent emails are being responded to up to four times faster than before.

Final remarks

We would love to hear your feedback about this article and project. If you have any questions or comments, feel free to drop us a line on social media.

If you enjoyed this article, spread the love around:

Share on Twitter

Share on Facebook

Thank you!

Marcin Druzkowski

Maciej Mnich and other data scientists from Ocado Technology contributed to this article

April 10th, 2017

Posted In: Blog


Cloud image

Being the world’s largest online-only supermarket means Ocado eats big data for breakfast. Since its inception more than three years ago, the data team at Ocado Technology has been finding ever more efficient ways to manage Ocado’s digital footprint.

One way to achieve this goal was to be at the forefront of adopting cloud technologies. This article aims to offer a brief overview of how the data team tackled a major project to move all of Ocado’s on-premise data to the cloud. There have been several important lessons we’ve learned along the way and I’d like to use this opportunity to share a few of them with you.

 

'Growth in data' diagram

The main motivation for starting this project was threefold:

  • Reducing costs: the old, on-premise stack was expensive to upgrade and maintain
  • Gaining more performance: we were hoping to achieve more elastic scaling based on demand
  • Data centralisation: we wanted to remove siloing of data between different departments and business divisions.

The project was initially resourced using our own internal data team; we felt confident the team had the required skills to do an initial proof of concept. We then used a third party provider who adopted a rinse and repeat approach based on our work.

From the start, we had a clear idea of when we could declare the project completed: all data from our on-prem analytics databases had to be migrated into the cloud into Google Cloud Storage or, ideally, BigQuery. This target would allow us to further exploit technologies like DataProc or TensorFlow on Google Cloud Machine Learning. Throughout the migration project, we could also easily quantify the benefit this move to the cloud was bringing as the cost of work (the humans and the system) was very obvious.

'BigQuery performance' diagram

We found there was no need to involve other parts of the business initially, and treated the project as a fixed-scope piece of work. However, as it evolved, we reevaluated the possibility of getting other teams involved so we could have a more inclusive, business-wide approach once the technology was well understood.

The ultimate desire was to move this project into the product stream to support the parallel streaming of data into the cloud. The prioritisation of these streams was handled by a product owner who also engaged with a steering group that took into account the current business needs.

We also set up a data curation team that would help business owners classify their data and land it in appropriate storage areas with correct access levels/retention, especially with Privacy Shield and GDPR. The data curation team also worked with the other teams to define the meaning of the data and create a set of business definitions.

Moving data around is not difficult, but assuring its quality is. How could we convince our stakeholders that the data in the cloud was indeed the same as that which they trusted on-premise? When it came to the quality of data, we implemented QA in several ways:

  • We validated that the source database and the cloud were in alignment (see the row-count sketch after this list)
  • Only certain data stores were classified as clean and assured
  • Data sources were prepared in Tableau to expose clean data
  • Those sources were validated with business users as they landed so that issues could be identified
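As a hedged illustration of the first check, one might compare row counts per day between the source export and BigQuery; the table and project names are placeholders:

```python
from google.cloud import bigquery

client = bigquery.Client()

query = """
    SELECT COUNT(*) AS n
    FROM `my-project.warehouse.orders`   -- placeholder table
    WHERE DATE(created_at) = '2017-01-15'
"""
cloud_count = list(client.query(query).result())[0].n

# Figure exported from the on-premise database for the same day.
source_count = 1234567
assert cloud_count == source_count, "row counts diverged for 2017-01-15"
```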

At the end of the project, we were able to develop a series of processes that were production ready and supported through our technology teams.

Since adopting the Google Cloud Platform, we’ve reduced storage costs to a tenth, increased our storage capacity more than twentyfold, and improved performance by hundreds of times compared to our previous approach of hosting data on-premise. Furthermore, our development cycles on the data in the cloud have been significantly reduced: on-demand computation power allows us to experiment and iterate with much less latency and friction. Our initial results show how a cloud-first strategy can really bring benefits to the business, and we look forward to working with other like-minded retailers through our cloud-based Ocado Smart Platform.

To learn more about how Ocado Technology adopted BigQuery and other Google Cloud services, please register for this webcast.

Dan Nelson, Head of Data

March 28th, 2017

Posted In: Blog


Contact Centre Agent

Being the world’s largest online-only grocery supermarket with over 500,000 active customers means we get the opportunity to interact with people all across the UK on a daily basis. Ocado prides itself on offering the best customer service in the industry which is one of the many reasons why our customers keep coming back.

Since Ocado doesn’t have physical stores, there are two main ways our customers and our employees interact directly. The first (and probably most common) is when our drivers deliver groceries to customers’ doorsteps; the second is when customers call or email us via our UK-based contact center.

Today we’re going to tell you a bit more about how a customer contact center works and how Ocado is making it smarter.

The customer contact center

On the surface, Ocado operates the kind of contact center most people are already familiar with; we provide several ways for our customers to get in touch, including social media, a UK landline number, and a contact email.

Contact Centre

Customers can email, tweet or call Ocado

When it comes to emails, we get quite a variety of messages: from general feedback and redelivery requests to refund claims, payment or website issues – and even new product inquiries.

Getting in touch with a company can sometimes feel cumbersome. To make the whole process nice and easy for our customers, we don’t ask them to fill in any forms or self-categorise their emails. Instead, all messages get delivered into a centralised mailbox no matter what they contain.

Contact Centre

Ocado customer service representatives filtering customer emails

However, a quick analysis of the classes of emails mentioned above reveals that not all of them should be treated with the same priority. In an old-fashioned contact centre, each email would be read and categorised by one of the customer service representatives and then passed on to the relevant department.

This model has a few major flaws: if the business starts scaling up quickly, customer service representatives may find it challenging to keep up, leading to longer delays that anger customers. In addition, sifting through emails is a very repetitive task that often causes frustration for contact centre workers.

Clearly there must be a better way!

Machine learning to the rescue

Unbeknownst to many, Ocado has a technology division of 1000+ developers, engineers, researchers and scientists working hard to build an optimal technology infrastructure that revolutionises the way people shop online. This division is called Ocado Technology and includes a data science team that constantly finds new ways to apply machine learning and AI techniques to improve the processes related to running retail operations and beyond.

After analysing the latest research on the topic, the data science team discovered that machine learning algorithms can be adapted to help contact centres cope with vast volumes of email.

The diagram below shows how we created our AI-based software application that helps our customer service team sort through the emails they receive daily.

Cloud computing model

The new AI-enhanced contact centre at Ocado

One of the fields related to machine learning is natural language processing (NLP), a discipline that combines computer science, artificial intelligence, and computational linguistics to create a link between computers and humans. Let’s use an email from a recent customer as an example to understand how we’ve deployed machine learning and NLP in our contact centres:

Example of feedback

The machine learning model identifies that the email contains general feedback and that the customer is happy

The software solution we’ve built parses the body of the email and creates tags that help contact centre workers determine the priority of each email. In our example, there is no immediate need for a representative to get in touch; the customer is satisfied with their order and has written a message thanking Ocado for their service.

We strive to deliver the best shopping experience for all our 500,000 + active customers. However, working in an omni channel contact centre can be challenging, with the team receiving thousands of contacts each day via telephone, email, webchat, social media and SMS. The new software developed by the Ocado Technology data science team will help the contact centre filter inbound customer contacts faster, enabling a quicker response to our customers which in turn will increase customer satisfaction levels. – Debbie Wilson, contact centre operations manager

In the case of a customer raising an issue about an order, the system detects that a representative needs to reply to the message urgently and therefore assigns the appropriate tag and colour code.

Data science at Ocado, using Google Cloud Platform and TensorFlow

This new ML-enhanced contact centre demonstrates how Ocado is using the latest technologies to make online shopping better for everyone.

Ocado was able to successfully deploy this new product in record time as a result of the close collaboration between three departments: data science, contact centre systems, and quality and development. Working together allowed us to share data and update models quickly, which we could then deploy in a real-world environment. Unlike a scientific demonstration where you’re usually working with a known set of quantities, the contact centre provided a much more dynamic scenario, with new data arriving constantly. – Pawel Domagala, product owner, last mile systems

Our in-house team of data scientists (check out our job openings here) trained the machine learning model on a large set of past emails. During the research phase, the team compared different architectures to find a suitable solution: convolutional neural networks (CNNs), long short-term memory networks (LSTMs), and others. Once the software architecture was chosen, the model was then implemented using the TensorFlow library and the Python programming language.

TensorFlow and Python logos

Python is the most popular programming language in the data science community and provides the syntactic simplicity and expressiveness we were looking for.

TensorFlow is a popular open-source machine learning toolkit that scales from research to production. TensorFlow is built around data flow graphs that can easily be constructed in Python, but the underlying computation is handled in C++, which makes it extremely fast.
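That split is easy to see in the TensorFlow 1.x-style API that was current at the time: the graph is described in Python and then executed by the C++ runtime. A minimal sketch:

```python
import tensorflow as tf  # TensorFlow 1.x-style graph API

# Building the graph happens in Python...
a = tf.constant([[1.0, 2.0]])
b = tf.constant([[3.0], [4.0]])
product = tf.matmul(a, b)

# ...but running it dispatches to the optimised C++ runtime.
with tf.Session() as sess:
    print(sess.run(product))  # [[11.]]
```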

We’re thrilled that TensorFlow helped Ocado adapt and extend state-of-the-art machine learning techniques to communicate more responsively with their customers. With a combination of open-source TensorFlow and Google Cloud services, Ocado and other leading companies can develop and deploy advanced machine learning solutions more rapidly than ever before. – Zak Stone, Product Manager for TensorFlow on the Google Brain Team

Understanding natural language is a particularly hard problem for computers. To overcome this obstacle, data scientists need access to large amounts of computational resources and well-defined APIs for natural language processing. Thanks to the Google Cloud Platform, Ocado was able to use the power of cloud computing and train our models in parallel. Furthermore, Ocado has been an early adopter of Google Cloud Machine Learning (now available to all businesses in public beta) as well as the Cloud Natural Language API.

Google Cloud Platform logo

If you want to learn more about the technologies presented above, check out this presentation from Marcin Druzkowski, senior software engineer at Ocado Technology.

Make sure you also have a look at our Ocado Smart Platform for an overview of how Ocado is changing the game for online shopping and beyond.

October 13th, 2016

Posted In: Blog


Cloud computing model

World’s largest online-only grocery retailer harnesses the power of AI and the Google Cloud Platform to categorize and prioritize customer emails

Ocado announces the deployment of its machine learning (ML)-enhanced contact center which employs an advanced AI (artificial intelligence) software model to categorize customer emails.

This approach ensures customers are still getting that familiar human touch while also benefiting from the quick response provided by technology automation. From the contact center point of view, the customer service representatives don’t have to spend hours categorizing thousands of emails manually; instead, the AI model parses the email and provides a useful summary and a priority tag. The customer service representative can then focus on solving the customers’ problems in a timely manner.

We strive to deliver the best shopping experience for all our 500,000+ active customers. However, working in an omni channel contact centre can be challenging, with the team receiving thousands of contacts each day via telephone, email, webchat, social media and SMS. The new software developed by the Ocado Technology data science team will help the contact centre filter inbound customer contacts faster, enabling a quicker response to our customers which in turn will increase customer satisfaction levels. – Debbie Wilson, Ocado contact centre operations manager

Thanks to a robust architecture, the software model can process thousands of customer emails per day and has been trained using millions of past messages from customers. In addition, the application respects customers’ privacy by filtering out personal details such as postal or email addresses, telephone numbers, and other sensitive information.

The new ML-enhanced contact center application has been built using an in-house AI model and data sets created by Ocado Technology (the technology division of Ocado) as well as TensorFlow and related products from the Google Cloud Platform.

We’re thrilled that TensorFlow helped Ocado adapt and extend state-of-the-art machine learning techniques to communicate more responsively with their customers. With a combination of open-source TensorFlow and Google Cloud services, Ocado and other leading companies can develop and deploy advanced machine learning solutions more rapidly than ever before. – Zak Stone, Product Manager for TensorFlow on the Google Brain Team

Ocado is also one of the leading partners for the Google Cloud Platform and its Cloud Natural Language API.

About Ocado
Established in 2000, Ocado is a UK-based company admitted to trading on the London Stock Exchange (OCDO), and is the world’s largest dedicated online grocery retailer, operating its own grocery and general merchandise retail businesses under the Ocado.com and other specialist shop banners. For more information about the Ocado Group, visit www.ocadogroup.com

About Ocado Technology
Ocado Technology is a division of Ocado developing world-class systems and solutions in the areas of robotics, machine learning, simulation, data science, forecasting and routing, inference engines, big data, real-time control, and more. The fusion between the Ocado retail and Ocado Technology divisions creates a virtuous circle of innovation that leads to disruptive thinking. For more information about Ocado Technology, visit www.ocadotechnology.com

October 13th, 2016

Posted In: Press releases


Paul at the meetup

Last week Ocado Technology had the pleasure of being invited to speak at the Data Science Festival organised at Google’s London headquarters in Soho. I was very lucky to be among the 200+ participants in the audience and would like to share with you a few insights from the Data Science Festival meetup as well as some information about how Ocado Technology uses machine learning to improve customer service and the overall efficiency of our Customer Fulfilment Centres (CFCs).

The meetup began with an introduction from Binesh Lad, head of retail for Google Cloud Platform UK & Ireland. He talked briefly about how Google is rapidly expanding its cloud offering, citing Coca-Cola, Best Buy, CCP Games (makers of EVE Online), and others as examples of customers using the Google Cloud Platform.

Binesh then jokingly played a video that introduced Google’s new, very exciting and definitely real product: the Actual Cloud (an April Fool’s prank that went viral a few months ago).

The second speaker of the evening was Paul Clarke, CTO at Ocado Technology. Paul offered a few quick facts about Ocado and how we have made online grocery shopping a reality over the last decade.

He then gave a few examples of how IoT, robotics and machine learning can be used together to improve the efficiency of warehouse operations and route optimisation for vans. Everyone in the audience was blown away by a sequence of short clips showing robots roaming around our new automation-based CFC in Andover, a real-time visualisation of the CFC in Dordon, and a live map of the vans delivering orders to Ocado customers in the UK.

Slide on screen of the new warehouse grid

Paul then moved on to the second part of his presentation, where he outlined how IoT is an unstoppable force that will usher in the true democratisation of hardware and software. Ocado Technology is already working on several IoT-related projects and is constantly adopting new ways of thinking into its product development cycles, based on the innovation emerging from the IoT community.

Closing the evening off was Marcin Druzkowski, senior software engineer at Ocado Technology.

Marcin offered his perspective on data science and how Ocado is applying software engineering principles like code versioning, code testing and review, and continuous improvement to machine learning.

Marcin on stage

He also provided some useful tips for TensorFlow developers and outlined tools such as Git, Docker, and Jupyter that his team uses for data science work. Finally, Marcin offered an example of how Ocado Technology is using data science and machine learning to analyze customer emails and improve its customer service.

Marcin presenting a slide on TensorFlow

After the event was over, I had the opportunity to chat with some of the people in the audience over beers and (free!) pizza. Many said it was an amazing presentation (a few said it was one of the best data science meetups they’d attended so far!) and were very excited to learn that Ocado Technology is a pioneer in machine learning and data science.

Alex Voica, Technology Communications Manager

 

September 1st, 2016

Posted In: Blog


SecondHands

Recently we kicked off an exciting project to develop an autonomous humanoid robot. It will use artificial intelligence, machine learning and advanced vision systems to understand what human workers want, in order to offer assistance.

For example, it will be able to hand tools to maintenance technicians, and manipulate objects like ladders, pneumatic cylinders and bolts.

The ultimate aim is for humans to come to rely on collaborative robots because the robots have become active participants in their daily tasks. In essence, the robot will know what to do, when to do it, and how to do it in a manner that a human can depend on.

The project is called SecondHands as it will literally provide a second pair of hands, and is part of the European Union’s Horizon 2020 Research and Innovation programme. We are leading the research, working with four other European institutions.

The tasks our robot will carry out will increase safety and efficiency, and require us to focus on key areas of robotics including:

Proactive assistance – the robot will have cognitive and perceptive ability to understand when and what help its operator needs, and then to provide it.

Artificial intelligence – to anticipate the needs of its operator and execute tasks without prompting, the robot will need to progressively acquire skills and knowledge.

3D perception – advanced 3D vision systems will allow the robot to estimate the 3D articulated pose of humans.

Humanoid form and flexibility – SecondHands will feature an active sensor head, two redundant torque controlled arms, two anthropomorphic hands, a bendable and extendable torso, and a wheeled mobile platform.

For more information, see the project’s website.

Dr Graham Deacon

Robotics Research Team Leader

UPDATE: If you think that sounds interesting, we’re looking for a talented Robotics Research Software Engineer to join the team. Take a look at the role now.

July 1st, 2015

Posted In: Blog

