FREE Receipt Images – OCR / Machine Learning Dataset

The ExpressExpense SRD (Sample Receipt Dataset) consists of 200 images of restaurant receipts. Each receipt is shown in its entirety and includes the business name, business address, itemized purchases and their cost, subtotal, tax (if applicable), and total. All receipt images are high quality, with the longest side larger than 600 pixels.

This sample receipt image dataset is well suited to software applications such as OCR, image pre-processing, computer vision, machine learning, and artificial intelligence.

Larger receipt image datasets are available for purchase from ExpressExpense.  

This dataset is free to use under the MIT License. If you use or reference this dataset, please cite our website as the source: ExpressExpense.com

Download

File: large-receipt-image-dataset-SRD.zip
Size: 19.2MB
md5sum (for verification): c8eb0f2d286da5ab742e7a5b59f15147
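
The md5sum listed above can be checked after downloading to confirm the archive arrived intact. The following is a minimal sketch in Python using only the standard library; the function names (`md5_of_file`, `verify_download`) and the assumption that the zip file sits in the current directory are illustrative and not part of the dataset itself.

```python
import hashlib
from pathlib import Path

# Checksum published alongside the download listing.
EXPECTED_MD5 = "c8eb0f2d286da5ab742e7a5b59f15147"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """True if the file's MD5 digest matches the expected hex string."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumed local path: the archive saved under its published file name.
    archive = Path("large-receipt-image-dataset-SRD.zip")
    if archive.exists():
        print("checksum OK" if verify_download(archive) else "checksum MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

A mismatch usually means the download was truncated or corrupted, in which case downloading the file again is the simplest remedy.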

If you create an interesting project using our dataset, we would love to reference it here and link to your paper or article.  Please contact us from our homepage and let us know how you utilized this receipt image dataset.

For more information on synthetic receipt generation, receipt datasets, and business services, please visit ExpressExpense Business Services.

Practical Uses for Receipt Dataset

This dataset of 200 high-resolution scanned images of receipts can be useful for various purposes. Here are a few potential applications:

  • Receipt Recognition and Data Extraction: With a labeled dataset, you can train machine learning models to recognize and extract information from receipts automatically. This can include extracting details like vendor name, date, total amount, individual items, tax information, etc. Such models can streamline data entry processes and be used in expense management systems, accounting software, or for auditing purposes.
  • Expense Tracking and Management: The dataset can be used to develop applications that allow users to easily track and manage their expenses. By extracting relevant information from the receipts, users can automatically categorize expenses, create reports, and analyze spending patterns.
  • Fraud Detection: Receipts can be used to commit fraud. By training models on a dataset of real receipts, you can build fraud detection systems that identify anomalies and flag suspicious transactions or receipts that deviate from regular patterns.
  • Consumer Research and Market Analysis: Analyzing a large dataset of receipts can provide valuable insights into consumer behavior and market trends. By aggregating and anonymizing the data, you can identify popular products, track purchasing patterns, measure the success of marketing campaigns, and make informed business decisions.
  • Personal Finance Tools: Receipt data can be leveraged to develop personal finance tools that help individuals manage their budgets, track expenses, and save money. By analyzing spending habits and providing personalized recommendations, these tools can assist users in making more informed financial decisions.
  • OCR (Optical Character Recognition) Training: Receipts often contain a mix of text and graphical elements. By using the dataset to train OCR models, you can improve their accuracy in recognizing characters within receipts, enabling better text extraction and analysis.
  • Digital Archive Creation: Scanned receipt images can be used to create digital archives for businesses or individuals. These archives provide a convenient and searchable way to store and retrieve receipts for future reference, accounting purposes, or warranty claims.
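
As a rough illustration of the data extraction and anomaly-flagging ideas in the list above, the sketch below pulls the subtotal, tax, and total out of text that an OCR engine has already produced, then checks that the amounts add up. It is a simple rule-based example, not a trained model. The OCR step is assumed to happen elsewhere, and the sample text, field patterns, and function names are hypothetical and not taken from the dataset.

```python
import re
from decimal import Decimal

# An amount such as 12.50, $12.50 or 1,234.50 at the end of a line.
AMOUNT = r"\$?\s*(\d[\d,]*\.\d{2})\s*$"

# Hypothetical label patterns; real receipts vary far more than this.
FIELD_PATTERNS = {
    "subtotal": re.compile(r"^\s*sub\s*-?\s*total\b.*?" + AMOUNT, re.I),
    "tax": re.compile(r"^\s*(?:sales\s+)?tax\b.*?" + AMOUNT, re.I),
    "total": re.compile(r"^\s*(?:grand\s+)?total\b.*?" + AMOUNT, re.I),
}


def extract_fields(ocr_text):
    """Return the first subtotal, tax and total found in OCR'd text."""
    found = {}
    for line in ocr_text.splitlines():
        for name, pattern in FIELD_PATTERNS.items():
            match = pattern.match(line)
            if match and name not in found:
                found[name] = Decimal(match.group(1).replace(",", ""))
    return found


def totals_are_consistent(fields, tolerance=Decimal("0.01")):
    """Check subtotal + tax == total; None if subtotal or total is missing."""
    if not {"subtotal", "total"} <= fields.keys():
        return None
    expected = fields["subtotal"] + fields.get("tax", Decimal("0"))
    return abs(expected - fields["total"]) <= tolerance


# Made-up OCR output for illustration only.
SAMPLE = """\
Joe's Diner
123 Main St
Burger        9.50
Fries         3.00
Subtotal     12.50
Tax           1.00
Total       $13.50
"""

if __name__ == "__main__":
    fields = extract_fields(SAMPLE)
    print(fields)
    print("consistent:", totals_are_consistent(fields))
```

A receipt whose printed total does not equal subtotal plus tax may simply reflect an OCR misread, so a failed check is better treated as a prompt for manual review than as proof of fraud.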

It’s important to note that the effectiveness of these applications heavily depends on the quality and diversity of the dataset. Therefore, ensuring a wide range of receipt types, vendors, and formats in the dataset will enhance its practical value.

If you require a larger dataset, please contact us!