| library_name: transformers | |
| license: mit | |
| datasets: | |
| - SpursgoZmy/MMTab | |
| - apoidea/pubtabnet-html | |
| language: | |
| - en | |
| base_model: google/pix2struct-base | |
| pipeline_tag: image-to-text | |
| # pix2struct-base-table2html | |
| *Turn table images into HTML!* | |
| ## Demo app | |
| Try the [demo app](https://huggingface.co/spaces/KennethTM/Table2html-table-detection-and-recognition) which contains both table detection and recognition! | |
| ## About | |
| This model takes an image of a table and outputs the table as HTML. It performs both optical character recognition (OCR) on the cell text and table structure recognition, and emits the result directly in HTML format. | |
| The model expects an image containing only a table. If the table is embedded in a document, first use a table detection model to extract it (e.g. [Microsoft's Table Transformer model](https://huggingface.co/microsoft/table-transformer-detection)). | |
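As a minimal sketch of the cropping step, assume a table detector has already returned a pixel bounding box `(x_min, y_min, x_max, y_max)`. The hypothetical helper `padded_crop_box` below adds a small margin and clamps the box to the image bounds. The helper name and the default padding of 10 pixels are illustrative assumptions, not part of this model or of Table Transformer.

```python
import math


def padded_crop_box(box, image_size, padding=10):
    """Expand a detected table box by `padding` pixels and clamp it to the image.

    box: (x_min, y_min, x_max, y_max) in pixels, e.g. from a table detection model.
    image_size: (width, height) of the full page image.
    Returns integer (left, top, right, bottom) coordinates.
    """
    x_min, y_min, x_max, y_max = box
    width, height = image_size
    # Round outwards so the crop never cuts into the detected region.
    left = max(0, math.floor(x_min) - padding)
    top = max(0, math.floor(y_min) - padding)
    right = min(width, math.ceil(x_max) + padding)
    bottom = min(height, math.ceil(y_max) + padding)
    if right <= left or bottom <= top:
        raise ValueError("bounding box lies outside the image")
    return left, top, right, bottom
```

The returned tuple can be passed to Pillow's `Image.crop`, for example `page_image.crop(padded_crop_box(box, page_image.size))`, where `page_image` and `box` are placeholders for your own page image and detection result.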
| The model is fine-tuned from the [Pix2Struct base model](https://huggingface.co/google/pix2struct-base) using a maximum of 1024 image patches (`max_patches=1024` in the processor call) and a maximum generation length of 1024 tokens. `max_patches` should likely stay at 1024 for inference, but the generation length (`max_new_tokens`) can be changed. | |
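Because generation stops once the generation length limit is reached, the output for a large table may be cut off mid-row. A rough way to notice this is sketched below, assuming a complete prediction ends with a closing `</table>` tag, as the example output in this README does. `looks_truncated` is a hypothetical helper, and the check is a heuristic rather than a guarantee.

```python
def looks_truncated(html: str) -> bool:
    """Heuristic: a complete prediction should end with a closing </table> tag."""
    return not html.strip().lower().endswith("</table>")
```

If the check returns `True`, one option is to rerun `model.generate` with a larger `max_new_tokens` value.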
| The model has been trained using two datasets: [MMTab](https://huggingface.co/datasets/SpursgoZmy/MMTab) and [PubTabNet](https://huggingface.co/datasets/apoidea/pubtabnet-html). | |
| ## Usage | |
| Below is a complete example of loading the model and performing inference on an example table image (example from the [MMTab dataset](https://huggingface.co/datasets/SpursgoZmy/MMTab)): | |
| ```python | |
| import torch | |
| from transformers import AutoProcessor, Pix2StructForConditionalGeneration | |
| from PIL import Image | |
| import requests | |
| from io import BytesIO | |
| # Load model and processor | |
| device = "cuda" if torch.cuda.is_available() else "cpu" | |
| processor = AutoProcessor.from_pretrained("KennethTM/pix2struct-base-table2html") | |
| model = Pix2StructForConditionalGeneration.from_pretrained("KennethTM/pix2struct-base-table2html") | |
| model.to(device) | |
| model.eval() | |
| # Load example image from URL | |
| url = "https://huggingface.co/KennethTM/pix2struct-base-table2html/resolve/main/example_recog_1.jpg" | |
| response = requests.get(url, timeout=30) | |
| response.raise_for_status()  # fail early if the download did not succeed | |
| image = Image.open(BytesIO(response.content)) | |
| # Run model inference | |
| encoding = processor(image, return_tensors="pt", max_patches=1024) | |
| with torch.inference_mode(): | |
|     flattened_patches = encoding.pop("flattened_patches").to(device) | |
|     attention_mask = encoding.pop("attention_mask").to(device) | |
|     predictions = model.generate(flattened_patches=flattened_patches, attention_mask=attention_mask, max_new_tokens=1024) | |
| predictions_decoded = processor.tokenizer.batch_decode(predictions, skip_special_tokens=True) | |
| # Show predictions as text | |
| print(predictions_decoded[0]) | |
| ``` | |
| Example image: | |
|  | |
| Model HTML output for example image: | |
| ```html | |
| <table border="1" cellspacing="0"> | |
| <tr> | |
| <th> | |
| Rank | |
| </th> | |
| <th> | |
| Lane | |
| </th> | |
| <th> | |
| Name | |
| </th> | |
| <th> | |
| Nationality | |
| </th> | |
| <th> | |
| Time | |
| </th> | |
| <th> | |
| Notes | |
| </th> | |
| </tr> | |
| <tr> | |
| <td> | |
| </td> | |
| <td> | |
| 4 | |
| </td> | |
| <td> | |
| Michael Phelps | |
| </td> | |
| <td> | |
| United States | |
| </td> | |
| <td> | |
| 51.25 | |
| </td> | |
| <td> | |
| OR | |
| </td> | |
| </tr> | |
| <tr> | |
| <td> | |
| </td> | |
| <td> | |
| 3 | |
| </td> | |
| <td> | |
| Ian Crocker | |
| </td> | |
| <td> | |
| United States | |
| </td> | |
| <td> | |
| 51.29 | |
| </td> | |
| <td> | |
| </td> | |
| </tr> | |
| <tr> | |
| <td> | |
| </td> | |
| <td> | |
| 5 | |
| </td> | |
| <td> | |
| Andriy Serdinov | |
| </td> | |
| <td> | |
| Ukraine | |
| </td> | |
| <td> | |
| 51.36 | |
| </td> | |
| <td> | |
| EU | |
| </td> | |
| </tr> | |
| <tr> | |
| <td> | |
| 4 | |
| </td> | |
| <td> | |
| 1 | |
| </td> | |
| <td> | |
| Thomas Rupprath | |
| </td> | |
| <td> | |
| Germany | |
| </td> | |
| <td> | |
| 52.27 | |
| </td> | |
| <td> | |
| </td> | |
| </tr> | |
| <tr> | |
| <td> | |
| 5 | |
| </td> | |
| <td> | |
| 6 | |
| </td> | |
| <td> | |
| Igor Marchenko | |
| </td> | |
| <td> | |
| Russia | |
| </td> | |
| <td> | |
| 52.32 | |
| </td> | |
| <td> | |
| </td> | |
| </tr> | |
| <tr> | |
| <td> | |
| 6 | |
| </td> | |
| <td> | |
| 2 | |
| </td> | |
| <td> | |
| Gabriel Mangabeira | |
| </td> | |
| <td> | |
| Brazil | |
| </td> | |
| <td> | |
| 52.34 | |
| </td> | |
| <td> | |
| </td> | |
| </tr> | |
| <tr> | |
| <td> | |
| 7 | |
| </td> | |
| <td> | |
| 8 | |
| </td> | |
| <td> | |
| Duje Draganja | |
| </td> | |
| <td> | |
| Croatia | |
| </td> | |
| <td> | |
| 52.46 | |
| </td> | |
| <td> | |
| </td> | |
| </tr> | |
| <tr> | |
| <td> | |
| 8 | |
| </td> | |
| <td> | |
| 7 | |
| </td> | |
| <td> | |
| Geoff Huegill | |
| </td> | |
| <td> | |
| Australia | |
| </td> | |
| <td> | |
| 52.56 | |
| </td> | |
| <td> | |
| </td> | |
| </tr> | |
| </table> | |
| ``` | |
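For downstream use, the HTML string can be turned into plain Python rows. The sketch below uses only the standard library's `html.parser`. The names `TableRows` and `html_table_to_rows` are hypothetical, and the parser assumes simple tables like the example above: it ignores `colspan` and `rowspan`, and does not handle nested tables.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of each <th>/<td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text outside a cell (e.g. whitespace between tags) is ignored.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def html_table_to_rows(html):
    """Return the table as a list of rows, each a list of cell strings."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    return parser.rows
```

Calling `html_table_to_rows(predictions_decoded[0])` on the output of the usage example would give one list per table row, with empty cells as empty strings. These rows can then be written out with the standard `csv` module.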
| And the rendered HTML table: | |
| <table border="1" cellspacing="0"> | |
| <tr> | |
| <th> | |
| Rank | |
| </th> | |
| <th> | |
| Lane | |
| </th> | |
| <th> | |
| Name | |
| </th> | |
| <th> | |
| Nationality | |
| </th> | |
| <th> | |
| Time | |
| </th> | |
| <th> | |
| Notes | |
| </th> | |
| </tr> | |
| <tr> | |
| <td> | |
| </td> | |
| <td> | |
| 4 | |
| </td> | |
| <td> | |
| Michael Phelps | |
| </td> | |
| <td> | |
| United States | |
| </td> | |
| <td> | |
| 51.25 | |
| </td> | |
| <td> | |
| OR | |
| </td> | |
| </tr> | |
| <tr> | |
| <td> | |
| </td> | |
| <td> | |
| 3 | |
| </td> | |
| <td> | |
| Ian Crocker | |
| </td> | |
| <td> | |
| United States | |
| </td> | |
| <td> | |
| 51.29 | |
| </td> | |
| <td> | |
| </td> | |
| </tr> | |
| <tr> | |
| <td> | |
| </td> | |
| <td> | |
| 5 | |
| </td> | |
| <td> | |
| Andriy Serdinov | |
| </td> | |
| <td> | |
| Ukraine | |
| </td> | |
| <td> | |
| 51.36 | |
| </td> | |
| <td> | |
| EU | |
| </td> | |
| </tr> | |
| <tr> | |
| <td> | |
| 4 | |
| </td> | |
| <td> | |
| 1 | |
| </td> | |
| <td> | |
| Thomas Rupprath | |
| </td> | |
| <td> | |
| Germany | |
| </td> | |
| <td> | |
| 52.27 | |
| </td> | |
| <td> | |
| </td> | |
| </tr> | |
| <tr> | |
| <td> | |
| 5 | |
| </td> | |
| <td> | |
| 6 | |
| </td> | |
| <td> | |
| Igor Marchenko | |
| </td> | |
| <td> | |
| Russia | |
| </td> | |
| <td> | |
| 52.32 | |
| </td> | |
| <td> | |
| </td> | |
| </tr> | |
| <tr> | |
| <td> | |
| 6 | |
| </td> | |
| <td> | |
| 2 | |
| </td> | |
| <td> | |
| Gabriel Mangabeira | |
| </td> | |
| <td> | |
| Brazil | |
| </td> | |
| <td> | |
| 52.34 | |
| </td> | |
| <td> | |
| </td> | |
| </tr> | |
| <tr> | |
| <td> | |
| 7 | |
| </td> | |
| <td> | |
| 8 | |
| </td> | |
| <td> | |
| Duje Draganja | |
| </td> | |
| <td> | |
| Croatia | |
| </td> | |
| <td> | |
| 52.46 | |
| </td> | |
| <td> | |
| </td> | |
| </tr> | |
| <tr> | |
| <td> | |
| 8 | |
| </td> | |
| <td> | |
| 7 | |
| </td> | |
| <td> | |
| Geoff Huegill | |
| </td> | |
| <td> | |
| Australia | |
| </td> | |
| <td> | |
| 52.56 | |
| </td> | |
| <td> | |
| </td> | |
| </tr> | |
| </table> |