Amazon Rekognition is a machine learning (ML) based image and video analysis service that can identify objects, people, text, scenes, and activities in images and videos, and detect inappropriate content. Amazon Rekognition text detection enables you to recognize and extract textual content from images and videos. For example, in image sharing and social media apps, you can use text in images to perform visual searches based on an index of images that contain the same keywords. In media and entertainment applications, you can catalog videos based on the text on the screen, such as ads, news, sport scores, and captions.
The following screenshot shows an example of a text-in-image extraction.
In this post, we show how REA Group implemented an automated image compliance solution for their real estate listings by using the Amazon Rekognition Text in Image feature via the DetectText API.
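As a rough illustration of what a DetectText call involves, the following Python sketch uses the AWS SDK for Python (Boto3) to request text detection for an image stored in Amazon S3 and then keeps only the line-level detections above a confidence threshold. The function names and the threshold of 80 are assumptions made for this example; they are not taken from REA Group's implementation.

```python
from typing import Any, Dict, List


def extract_lines(response: Dict[str, Any], min_confidence: float = 80.0) -> List[str]:
    """Return the text of LINE detections at or above min_confidence.

    DetectText returns both LINE and WORD entries; only lines are kept here
    so that each piece of text is reported once.
    """
    return [
        detection["DetectedText"]
        for detection in response.get("TextDetections", [])
        if detection.get("Type") == "LINE"
        and detection.get("Confidence", 0.0) >= min_confidence
    ]


def detect_text_in_s3_image(bucket: str, key: str) -> Dict[str, Any]:
    """Call the DetectText API for an image stored in Amazon S3."""
    import boto3  # imported lazily so extract_lines works without AWS access

    client = boto3.client("rekognition")
    return client.detect_text(Image={"S3Object": {"Bucket": bucket, "Name": key}})
```

With valid AWS credentials, `extract_lines(detect_text_in_s3_image("my-bucket", "listing.jpg"))` would return the detected lines for that image; the bucket and key shown here are placeholders.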
About REA Group
REA Group is a multinational digital advertising company specializing in property and real estate. The company has been in the market for over 20 years with businesses in Australia, Malaysia, Hong Kong, Thailand, Indonesia, Singapore, and China. REA Group's business in Asia includes leading portal brands like iproperty.com.my, squarefoot.com.hk, thinkofliving.com, and a significant stake in 99 Group in Singapore and Indonesia. REA Group also has significant shareholdings in Move, Inc and PropTiger in India. They provide buying, selling, and renting services to their consumers, along with the latest property news, renovation tips, and lifestyle content. Millions of consumers visit REA Group websites daily.
Image compliance challenges
REA Group provides search-based portals that enable property sellers to upload images of properties on the market to deliver a wide, searchable selection to their consumers. The REA team discovered that images uploaded to the portal often weren’t compliant with their usage terms. Some images included trademarks or contact details of the sellers, which created lead attribution challenges. They set up a dedicated human-based team to manually review the images for unapproved content, but the large volume of daily uploads and the additional review process delayed the property listing time by several days.
Image compliance solution
The REA team developed an image compliance system that automatically detects any noncompliance and notifies sellers. Initially, they trained their own trademark and contact detail detection ML models on Amazon Elastic Compute Cloud (Amazon EC2). However, they were seeing many false positives with their models, specifically with contact detail detection. They needed to do more work to increase their model’s accuracy, which involved extensive effort in model training and optimization. To meet their project goals and timelines, the team needed a solution that was simple to implement and could deliver the accuracy the business was looking for.
With that goal in mind, they decided to augment their existing ML models and review workflow with Amazon Rekognition Text in Image to increase the accuracy of detecting noncompliance and reduce false positives. They added business rules that factored in a variety of predictions from their own models and from Amazon Rekognition to enable automated decision-making.
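The post does not describe the business rules themselves, so the following Python sketch is only one hypothetical way to combine signals. It assumes the in-house models return scores between 0 and 1, that contact details in the detected text can be approximated with simple email and phone patterns, and that a contact-detail flag is acted on automatically only when the detected text corroborates the model. The thresholds, patterns, and names are all illustrative assumptions.

```python
import re
from enum import Enum
from typing import List


class Decision(Enum):
    AUTO_APPROVE = "auto_approve"
    ROUTE_TO_REVIEWER = "route_to_reviewer"
    NOTIFY_AGENT = "notify_agent"


# Illustrative patterns only; real contact-detail matching would need tuning.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE = re.compile(r"(?:\+?\d[\s-]?){8,}")  # 8 or more digits, optionally spaced


def text_has_contact_details(lines: List[str]) -> bool:
    """True if any detected line looks like an email address or phone number."""
    return any(EMAIL.search(line) or PHONE.search(line) for line in lines)


def decide(
    trademark_score: float,
    contact_score: float,
    detected_lines: List[str],
    high: float = 0.9,
    low: float = 0.5,
) -> Decision:
    """Combine hypothetical model scores with text detected in the image."""
    text_hit = text_has_contact_details(detected_lines)
    # Act automatically only on corroborated or very confident signals.
    if (contact_score >= low and text_hit) or trademark_score >= high:
        return Decision.NOTIFY_AGENT
    # A single uncorroborated signal goes to a human reviewer instead.
    if contact_score >= low or text_hit or trademark_score >= low:
        return Decision.ROUTE_TO_REVIEWER
    return Decision.AUTO_APPROVE
```

Requiring agreement between two independent signals before acting automatically is one common way to trade a small amount of manual review for fewer false positives.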
To further optimize their inference infrastructure operating cost, the REA team adopted an event-driven architecture using AWS Lambda to host the trademark and contact detail model inference engine. This approach not only improved their infrastructure resource usage efficiency, but also saved costs while meeting their business objectives.
How it works
The solution is built on a serverless stack, as shown in the following architecture. Amazon API Gateway fronts the Image Upload API, which stores images in Amazon Simple Storage Service (Amazon S3). Each upload triggers an event-driven workflow in which Lambda runs a series of ML models and business rules for automated decision-making.
The event-driven workflow is as follows:
- A seller submits a property listing with images to the portal via API Gateway.
- The image uploads to Amazon S3, which triggers an Amazon S3 event.
- The event, containing the Amazon S3 object’s associated metadata, is published to a distributed queue, Amazon Simple Queue Service (Amazon SQS).
- Using the Amazon SQS and Lambda integration, Lambda polls the queue and invokes a Lambda function when new events are available. Lambda automatically invokes more function instances to handle any increase in events published to Amazon SQS.
- When a function is invoked, it runs the image review business logic, calling the trademark and contact detail models along with Amazon Rekognition to detect noncompliance.
- The models' outputs are combined and evaluated against business rules to decide the next step: notify the agent, route to the reviewer team for inspection, or auto-approve.
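The workflow above can be sketched as a Lambda handler in Python. This is a minimal outline, assuming the S3 event notification is delivered directly to the queue so that each SQS message body contains the S3 event JSON. The `review_image` function is a hypothetical placeholder for the model inference and business rules, and the returned dictionary is an arbitrary shape chosen for this example.

```python
import json
from typing import Any, Callable, Dict, Iterator, Tuple
from urllib.parse import unquote_plus


def iter_s3_objects(sqs_event: Dict[str, Any]) -> Iterator[Tuple[str, str]]:
    """Yield (bucket, key) for every S3 record wrapped in an SQS event."""
    for message in sqs_event.get("Records", []):
        notification = json.loads(message["body"])
        # Messages without "Records" (for example S3 test events) are skipped.
        for record in notification.get("Records", []):
            bucket = record["s3"]["bucket"]["name"]
            # Object keys in S3 notifications are URL-encoded.
            key = unquote_plus(record["s3"]["object"]["key"])
            yield bucket, key


def review_image(bucket: str, key: str) -> str:
    """Hypothetical placeholder for model inference plus business rules."""
    return "route_to_reviewer"


def handler(
    event: Dict[str, Any],
    context: Any,
    review: Callable[[str, str], str] = review_image,
) -> Dict[str, Any]:
    """Lambda entry point: review each uploaded image referenced in the batch."""
    results = [
        {"bucket": bucket, "key": key, "decision": review(bucket, key)}
        for bucket, key in iter_s3_objects(event)
    ]
    return {"reviewed": results}
```

Passing the review function as a parameter keeps the event parsing testable without AWS access; Lambda itself calls the handler with only `event` and `context`, so the default is used in deployment.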
“As our presence grew, the need to become more efficient became an important factor for us to scale and the team began to brainstorm how to serve our customers better, while maintaining a lean team,” says Mohammad Alauddin, Head of Data Science and Engineering. “Using machine learning on AWS via AWS Lambda and Amazon Rekognition, we increased the number of compliant, high-quality listings on our platform while reducing listing times and costs. Moreover, not only were we able to complete the project within planned timelines, we were also able to reduce the number of false positives by more than 56 percent.”
You can test Amazon Rekognition text detection on images specific to your business on the Amazon Rekognition console. For more information about Amazon Rekognition text detection APIs, see Amazon Rekognition Documentation.
About the Author
Fabian Tan is a Senior Solutions Architect at Amazon Web Services. He has a strong passion for databases, data analytics, and machine learning, and works closely with the Malaysian developer community to help them innovate. In his spare time, he enjoys camping outdoors with his family, reading, and playing sports.