AWS

Welcome to the Ocado Technology Webinars, where you can hear from the people building the ground-breaking, game-changing technology that powers Ocado, the world’s largest online-only grocery retailer.

In this webinar Javi Carretero, mobile web services team leader at Ocado Technology, offers an overview of AWS Lambda and serverless computing, including a code demonstration showing how easy it is to get started with cloud-based web services.

Key takeaways

  • Serverless architectures are a hot topic in the cloud computing space
  • Before building a new application or service in AWS, ask yourself what influences your production environment
  • There are many considerations that define a production environment, but perhaps the most important ones are performance, security, availability and the deployment strategy
  • Serverless architectures attempt to automatically solve the cost issues affecting traditional provisioning by offloading the task of container lifecycle management to the service provider
  • Because serverless architectures are entirely managed by service providers, developers should not make assumptions about the state of a container before a function is executed
  • Serverless architectures offer developers many advantages, including fully managed scaling, almost-zero system administration, and the ability to focus mainly on function code
  • Developers also need to be aware that serverless architectures are a new concept and therefore there are several potential drawbacks related to maturity and start-up latency


00:46: An overview of the factors deciding the structure of your AWS production environment.

02:41: Traditional provisioning that scales up and down according to throughput can easily be affected by inefficient allocation of resources.

04:05: Serverless architectures can be seen as a FaaS (Function as a Service).

05:01: In serverless architectures, developers don’t have to worry about provisioning; the containers are fully managed by the service provider.

05:34: Serverless containers might not run 24/7: the service provider can choose to shut them down or start new ones automatically as events are flowing through.

06:00: Serverless containers are also ephemeral: developers don’t have a guarantee that the function will always run on the same container.

06:24: The provisioning profile of a serverless architecture.

07:00: A coding example showing how developers can create a function in AWS; the utilities available for working with functions; testing within the console; and creating a production application using AWS Lambda.

07:35: How to create and configure an AWS Lambda function.

13:05: An overview of two AWS Lambda example functions that can be used for e-commerce purposes e.g. generating analytics based on customer checkouts.

20:35: Running the checkout stream application and observing the data flow for the example functions.

24:53: Covering the advantages and disadvantages of using serverless architectures for cloud computing, including scaling capabilities, administration costs, stateless services, and startup latencies. (Note: when it comes to startup latency, our initial testing showed that AWS Lambda cold starts are comparable to EC2 scaling. Since Amazon keeps warm containers ready, this should not have a big impact on your application.)

28:20: Final thoughts on AWS Lambda.

More about our webinars

You can keep up to date with the webinars by subscribing to our YouTube channel. This article provides clickable links that take you directly to the highlighted part of the video clip.

April 27th, 2017

Posted In: Blog

Cloud image

Being the world’s largest online-only supermarket means Ocado eats big data for breakfast. Since its inception more than three years ago, the data team at Ocado Technology has been finding ever more efficient ways to manage Ocado’s digital footprint.

One way to achieve this goal was to be at the forefront of adopting cloud technologies. This article aims to offer a brief overview of how the data team tackled a major project to move all of Ocado’s on-premise data to the cloud. There have been several important lessons we’ve learned along the way and I’d like to use this opportunity to share a few of them with you.

'Growth in data' diagram

The main motivation for starting this project was threefold:

  • Reducing costs: the old, on-premise stack was expensive to upgrade and maintain
  • Gaining more performance: we were hoping to achieve more elastic scaling based on demand
  • Data centralisation: we wanted to remove siloing of data between different departments and business divisions.

The project was initially resourced using our own internal data team; we felt confident the team had the required skills to do an initial proof of concept. We then used a third party provider who adopted a rinse and repeat approach based on our work.

From the start, we had a clear idea of when we could declare the project completed: all data from our on-prem analytics databases had to be migrated to the cloud, into Google Cloud Storage or, ideally, BigQuery. This target would allow us to further exploit technologies like DataProc or TensorFlow on Google Cloud Machine Learning. Throughout the migration project, we could also easily quantify the benefit this move to the cloud was bringing, as the cost of the work (both the people and the systems) was very obvious.

'BigQuery performance' diagram

We found there was no need to involve other parts of the business initially, and treated the project as a fixed-scope piece of work. However, as it evolved, we reevaluated the possibility of getting other teams involved so we could have a more inclusive, business-wide approach once the technology was well understood.

The ultimate desire was to move this project into the product stream to support the parallel streaming of data into the cloud. The prioritisation of these streams was handled by a product owner who also engaged with a steering group that took into account the current business needs.

We also set up a data curation team that would help business owners classify their data and land it in appropriate storage areas with correct access levels/retention, especially with Privacy Shield and GDPR. The data curation team also worked with the other teams to define the meaning of the data and create a set of business definitions.

Moving data around is not difficult, but assuring its quality is. How could we convince our stakeholders that the data in the cloud was indeed the same as that which they trusted on-premise? When it came to the quality of data, we implemented QA in several ways:

  • We validated that the source database and the cloud were in alignment.
  • Only certain data stores were classified as clean and assured
  • Data sources were prepared in Tableau to expose clean data
  • Those sources were validated with business users as they landed so that issues could be identified
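
The first check in the list, confirming that the source database and the cloud copy are in alignment, could be sketched roughly as follows. This is an illustration under stated assumptions, not the process the team actually ran: the function names are hypothetical, rows are assumed to arrive as plain tuples from each system, and values are compared by their string form, so both sides would need to be normalised to the same types first.

```python
import hashlib

MODULUS = 2 ** 256


def table_fingerprint(rows):
    """Return (row_count, checksum) for an iterable of row tuples.

    Per-row hashes are added modulo 2**256, so the result does not depend on
    the order in which each system returns its rows.
    """
    count = 0
    combined = 0
    for row in rows:
        # Join fields with a separator that is unlikely to appear in the data.
        encoded = "\x1f".join(str(value) for value in row)
        digest = hashlib.sha256(encoded.encode("utf-8")).digest()
        combined = (combined + int.from_bytes(digest, "big")) % MODULUS
        count += 1
    return count, combined


def in_alignment(source_rows, cloud_rows):
    """True when both sides have the same row count and checksum."""
    return table_fingerprint(source_rows) == table_fingerprint(cloud_rows)
```

A matching fingerprint is strong evidence that the two copies agree, while a mismatch only says that something differs; finding the offending rows would need a finer-grained comparison, for example per partition or per day.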

At the end of the project, we were able to develop a series of processes that were production ready and supported through our technology teams.

Since adopting the Google Cloud Platform, we’ve reduced storage costs to a tenth, increased our storage capacity more than twentyfold and improved performance by hundreds of times compared to our previous approach of hosting data on-premise. Furthermore, our development cycles on the data in the cloud have been significantly shortened, as on-demand computation power allows us to experiment and iterate with much less latency and friction. Our initial results show how a cloud-first strategy can really bring benefits to the business, and we look forward to working with other like-minded retailers through our cloud-based Ocado Smart Platform.

To learn more about how Ocado Technology adopted BigQuery and other Google Cloud services, please register for this webcast.

Dan Nelson, Head of Data

March 28th, 2017

Posted In: Blog

Cloud image

One of the biggest projects under way in the Ocado Technology data team has been building our own solution for storing and processing extremely large sets of customer data in the cloud. An online-only supermarket, Ocado.com has a dedicated customer base of over 580,000 active users who visit our website several times a week and spend more than £100 per shop.

The nature of our business brings a level of complexity unlike anything else in the retail world; indeed, many have claimed that grocery is the holy grail of the retail market – and data analytics is playing a big role in customer retention.

Managing this sea of customer data is no easy task. Although we initially used Hadoop, the rapid growth of our customer base meant we had to find something that could scale quickly, and with minimal maintenance overhead, so we refocused on BigQuery and the Google Cloud Platform.

When we made the decision to adopt the cloud, we had several end goals in mind. Firstly, we wanted to improve the customer experience and streamline the feature design and testing of our Ocado.com webshop. Secondly, we realised that a cloud-first strategy would truly empower our business teams to have greater insight into our merchandising operations. Finally, we were looking to improve the responsiveness of the Ocado.com webshop by using some of the intrinsic advantages of the cloud (i.e. elastic performance and storage).

How we did it chart

The initial work was carried out by a small cross-functional team that included a product owner, software engineers, business analysts, and business users. Even though the product owner was part of Ocado Technology, he had a horizontal line into the retail division to ensure that the solution remained cost effective and business viable. The business user was expected to adopt the report to make the transition from initial proof of concept to production easier. Overall, the entire team needed to be agile as the technology was new and therefore the final solution could change quickly along the way.

We timeboxed and ran the first instance of the project in Kanban to deliver a proof of concept as quickly as possible and with minimal cost, hoping to illustrate the value of the project as soon as possible – and then quickly implement it at a larger scale.

Performance points

Switching from Hadoop to a managed service such as BigQuery part way through the project revealed a series of cost and performance improvements. We no longer needed to decide how many instances to bring up in the cluster or wait for it to instantiate; we simply ran our queries and paid for the IO. We also didn’t have to perform maintenance on the servers or the Hadoop distribution. Google handled all of the uptime needs and backend patching. Most of all, though, we saw BigQuery outperform our Hadoop cluster by over 80 times on our largest dataset, and for half the cost.

BigQuery cost graph

It would have been completely cost ineffective to scale our cluster to meet the BigQuery performance level. A further side benefit was that the data in BigQuery was accessible in various other GCP services (e.g. DataProc) and so we could leverage that power in multiple ways but only have to store a single copy of the data.

At the beginning, we accepted that the data came with lower quality assurance and zero SLAs. Once the proof of concept proved valuable, the pipelines were productionised. This is where we learned a great deal: the data teams weren’t actually best placed to productionise the pipelines, especially with respect to quality and governance. We then spent a lot of time figuring out who could optimally handle this task and found that the data producing teams were best placed to do it.

The next phase was to involve the data producing teams (in this case, the webshop) and have them build the pipelines using the technology platform we supported. This way, we made sure that producers had direct control over the quality, timeliness and meaning of their data. Organisationally, it also allowed the new features being built by the webshop team at the request of the retail business to be considered in this pipeline, with the data department acting more as advisors.

To learn more about how Ocado Technology adopted BigQuery, please register for this webcast.

Video screenshot

Looking back, this project allowed the data team to gain valuable insight into the process of moving to the cloud and, more specifically, of using the Google Cloud Platform for customer analytics. With the Ocado Smart Platform, we’re looking forward to replicating this success story and rolling it out to other grocery retailers, as well as continuing to use it ourselves.

Dan Nelson, Head of Data

March 10th, 2017

Posted In: Blog

Alex Howard Whitaker

Welcome to the Ocado Technology Webinars, where you can hear from the people building the ground-breaking, game-changing technology that powers Ocado, the world’s largest online-only grocery retailer.

In this webinar Alex Howard Whitaker, cloud services engineer at Ocado Technology, talks about the challenges and opportunities involved in adopting a cloud-first strategy using Amazon AWS.

Key Takeaways

  • When Ocado Technology decided to adopt a public cloud solution, the main challenges revolved around automation, security and microservices deployment
  • Ocado Technology used an Agile approach to cloud adoption
  • Constantly building for failure helped mitigate security and performance issues
  • The cloud adoption strategy was shaped by a set of well-defined best practices and guidelines
  • The growing list of technology stacks made it clear that a managed services-focused platform provided the best solution for the team’s specific needs
  • Creating an environment where the cloud infrastructure and cloud development teams worked together ultimately improved the overall platform and the tools around it
  • Ocado adopted Amazon AWS for operational services and the Google Cloud Platform for data analytical services
  • Having a separate AWS configuration for each customer made data segregation easier to manage
  • APIs are a great way to implement access control for applications

00:46: Starting from scratch means there are many choices and there is no right or wrong answer

01:08: Ocado Technology wanted to use AWS for existing systems and had to take an Agile approach to its implementation

01:39: Adopting a public cloud solution brought questions around security and performance

02:28: The development team looked at various cloud success stories, including Netflix

02:49: In order to ensure consistent adoption, the cloud teams created a set of best practices and guidelines

03:36: Ocado chose to use managed services offered by a service provider wherever possible to accelerate development without any downtime

04:08: The systems created to manage the cloud were also hosted in exactly the same way as the cloud applications themselves

04:34: The cloud teams evaluated both AWS and GCP and found the former better suited to front-end, operational services and the latter better suited to back-end, data analytical systems

06:08: Amazon AWS provides a myriad of services and there was a lot to learn about their individual characteristics

06:28: The first AWS implementation was relatively simple and straightforward

06:59: The second attempt joined the network hub account with a VPN back end

07:50: The third configuration aimed to decentralize various end points to improve access speed

08:44: The cloud team learned a lot from deploying live applications into the AWS configurations, particularly around data segregation, service limits and throttling

11:53: In the fourth version of the AWS implementation, the Ocado Technology team used the information gained from live deployment to create a more flexible configuration that could scale easily

12:19: The new architecture was based on microservices that used APIs to ensure abstraction of resources

13:00: Using access control and tagging to create better permissions for AWS applications

15:40: The architecture of our deployment process included an app registry, cloud provisioning, AMI build automation, cloud formation scripts and more

17:25: Using AWS Elastic Beanstalk, Ocado Technology deployed 250+ applications over a choice of stacks (Java, Python, NodeJS) and servers

17:52: Concluding remarks

More about our webinars

You can keep up to date with the webinars by subscribing to our YouTube channel. This article provides clickable links that take you directly to the highlighted part of the video clip.

January 9th, 2017

Posted In: Blog

Contact Centre Agent

Being the world’s largest online-only grocery supermarket with over 500,000 active customers means we get the opportunity to interact with people all across the UK on a daily basis. Ocado prides itself on offering the best customer service in the industry which is one of the many reasons why our customers keep coming back.

Since Ocado doesn’t have physical stores, there are two main ways our customers and our employees interact directly. The first (and probably most common) is when our drivers deliver the groceries to the customers’ doorsteps; the second is when customers call or email us using our contact centre based in the UK.

Today we’re going to tell you a bit more about how a customer contact centre works and how Ocado is making it smarter.

The customer contact centre

On the surface, Ocado operates the kind of contact centre most people are already familiar with; we provide several ways for our customers to get in touch, including social media, a UK landline number, and a contact email.

Contact Centre

Customers can email, tweet or call Ocado

When it comes to emails, we get quite a variety of messages: from general feedback and redelivery requests to refund claims, payment or website issues – and even new product inquiries.

Getting in touch with a company can sometimes feel cumbersome. To make the whole process nice and easy for our customers, we don’t ask them to fill in any forms or self-categorise their emails. Instead, all messages get delivered into a centralised mailbox no matter what they contain.

Contact Centre

Ocado customer service representatives filtering customer emails

However, a quick analysis of the classes of emails mentioned above reveals that not all of them should be treated with the same priority. In an old-fashioned contact centre, each email would be read and categorised by one of the customer service representatives and then passed on to the relevant department.

This model has a few major flaws: if the business starts scaling up quickly, customer service representatives may find it challenging to keep up, leading to longer delays which will anger customers. In addition, sifting through emails is a very repetitive task that often causes frustration for contact centre workers.

Clearly there must be a better way!

Machine learning to the rescue

Unbeknownst to many, Ocado has a technology division of 1000+ developers, engineers, researchers and scientists working hard to build an optimal technology infrastructure that revolutionises the way people shop online. This division is called Ocado Technology and includes a data science team that constantly finds new ways to apply machine learning and AI techniques to improve the processes related to running retail operations and beyond.

After analysing the latest research on the topic, the data science team discovered that machine learning algorithms can be adapted to help customer centres cope with vast amounts of emails.

The diagram below shows how we created our AI-based software application that helps our customer service team sort through the emails they receive daily.

Cloud computing model

The new AI-enhanced contact centre at Ocado

One of the fields related to machine learning is natural language processing (NLP), a discipline that combines computer science, artificial intelligence, and computational linguistics to create a link between computers and humans. Let’s use an email from a recent customer as an example to understand how we’ve deployed machine learning and NLP in our contact centres:

Example of feedback

The machine learning model identifies that the email contains general feedback and that the customer is happy

The software solution we’ve built parses the body of the email and creates tags that help contact centre workers determine the priority of each email. In our example, there is no immediate need for a representative to get in touch; the customer is satisfied with their order and has written a message thanking Ocado for their service.

We strive to deliver the best shopping experience for all our 500,000+ active customers. However, working in an omnichannel contact centre can be challenging, with the team receiving thousands of contacts each day via telephone, email, webchat, social media and SMS. The new software developed by the Ocado Technology data science team will help the contact centre filter inbound customer contacts faster, enabling a quicker response to our customers which in turn will increase customer satisfaction levels. – Debbie Wilson, contact centre operations manager

In the case of a customer raising an issue about an order, the system detects that a representative needs to reply to the message urgently and therefore assigns the appropriate tag and colour code.
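
The tagging step can be pictured with a small Python sketch. To be clear about what is swapped in: simple keyword matching stands in here for the trained machine learning model described in this article, and the tag names, keywords and colour codes are all hypothetical. The sketch only shows the shape of the output a contact centre worker would see, namely a tag, a colour and an urgency flag.

```python
from dataclasses import dataclass

# Hypothetical tags, keywords and colours; the real system uses a trained model.
# Rules are ordered from most urgent to least urgent.
RULES = [
    ("order issue", ("missing", "damaged", "never arrived", "wrong item"), "red"),
    ("refund claim", ("refund", "charged twice"), "amber"),
    ("general feedback", ("thank", "great service", "happy"), "green"),
]


@dataclass
class Triage:
    tag: str
    colour: str
    urgent: bool


def triage_email(body):
    """Assign the first matching tag, so urgent problems win over praise."""
    text = body.lower()
    for tag, keywords, colour in RULES:
        if any(keyword in text for keyword in keywords):
            return Triage(tag, colour, urgent=(colour == "red"))
    return Triage("uncategorised", "grey", urgent=False)
```

For example, `triage_email("Thank you, but two items were missing")` is tagged as an order issue rather than as feedback, because the rules are checked in order of urgency.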

Data science at Ocado, using Google Cloud Platform and TensorFlow

This new ML-enhanced contact centre demonstrates how Ocado is using the latest technologies to make online shopping better for everyone.

Ocado was able to successfully deploy this new product in record time as a result of the close collaboration between three departments: data science, contact centre systems, and quality and development. Working together allowed us to share data and update models quickly, which we could then deploy in a real-world environment. Unlike a scientific demonstration where you’re usually working with a known set of quantities, the contact centre provided a much more dynamic scenario, with new data arriving constantly. – Pawel Domagala, product owner, last mile systems

Our in-house team of data scientists (check out our job openings here) trained the machine learning model on a large set of past emails. During the research phase, the team compared different architectures to find a suitable solution: convolutional neural networks (CNNs), long short-term memory networks (LSTMs) and others. Once the software architecture was created, the model was then implemented using the TensorFlow library and the Python programming language.

TensorFlow and Python logos

Python is the de facto standard programming language in the data science community and provides the simple syntax and expressiveness we were looking for.

TensorFlow is a popular open-source machine learning toolkit that scales from research to production. TensorFlow is built around data flow graphs that can easily be constructed in Python, but the underlying computation is handled in C++ which makes it extremely fast.
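
The idea of a data flow graph, where the computation is described first and run later, can be shown with a toy example in plain Python. This is deliberately not TensorFlow's API: the `Node` class and the `placeholder`, `constant`, `add` and `multiply` helpers are hypothetical names invented for this sketch. In TensorFlow, the execution step is handed to the C++ runtime instead of being evaluated in Python.

```python
class Node:
    """One operation in a data flow graph; nothing runs until evaluate()."""

    def __init__(self, op, *inputs):
        self.op = op
        self.inputs = inputs

    def evaluate(self, feed):
        # Evaluate upstream nodes first, then apply this node's operation.
        args = [node.evaluate(feed) for node in self.inputs]
        return self.op(feed, *args)


def placeholder(name):
    """A value supplied at run time through the feed dictionary."""
    return Node(lambda feed: feed[name])


def constant(value):
    return Node(lambda feed: value)


def add(a, b):
    return Node(lambda feed, x, y: x + y, a, b)


def multiply(a, b):
    return Node(lambda feed, x, y: x * y, a, b)


# Build the graph first: y = 3 * x + 1. No arithmetic happens here.
x = placeholder("x")
y = add(multiply(constant(3.0), x), constant(1.0))

# Run it later, with a different input on each call.
result = y.evaluate({"x": 2.0})
```

Separating graph construction from execution is what lets a framework optimise the graph and run it in a faster language than the one used to describe it.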

We’re thrilled that TensorFlow helped Ocado adapt and extend state-of-the-art machine learning techniques to communicate more responsively with their customers. With a combination of open-source TensorFlow and Google Cloud services, Ocado and other leading companies can develop and deploy advanced machine learning solutions more rapidly than ever before. – Zak Stone, Product Manager for TensorFlow on the Google Brain Team

Understanding natural language is a particularly hard problem for computers. To overcome this obstacle, data scientists need access to large amounts of computational resources and well-defined APIs for natural language processing. Thanks to the Google Cloud Platform, Ocado was able to use the power of cloud computing and train our models in parallel. Furthermore, Ocado has been an early adopter of Google Cloud Machine Learning (now available to all businesses in public beta) as well as the Cloud Natural Language API.

Google Cloud Platform logo

If you want to learn more about the technologies presented above, check out this presentation from Marcin Druzkowski, senior software engineer at Ocado Technology.

Make sure you also have a look at our Ocado Smart Platform for an overview of how Ocado is changing the game for online shopping and beyond.

October 13th, 2016

Posted In: Blog
