TrOCR model fine-tuning [Transformer-based optical character recognition]

The TrOCR (Transformer-based Optical Character Recognition) model is one of the best-performing OCR models. In our previous article we analyzed its performance on single lines of printed and handwritten text. However, like any other deep learning model, it has its limitations. Out of the box, TrOCR does not perform well on curved text. This article […]
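
For reference, a minimal TrOCR inference sketch using the Hugging Face transformers API; the checkpoint name and image path below are illustrative assumptions, not taken from the article:

```python
# Minimal TrOCR inference sketch (assumes the checkpoint
# "microsoft/trocr-base-handwritten" and a local line image "line.png").
from PIL import Image
from transformers import TrOCRProcessor, VisionEncoderDecoderModel

processor = TrOCRProcessor.from_pretrained("microsoft/trocr-base-handwritten")
model = VisionEncoderDecoderModel.from_pretrained("microsoft/trocr-base-handwritten")

image = Image.open("line.png").convert("RGB")                      # a single text line
pixel_values = processor(images=image, return_tensors="pt").pixel_values
generated_ids = model.generate(pixel_values)                       # autoregressive decoding
text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(text)
```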

ONNX export of the Swin Transformer

1. Set up the Swin Transformer environment according to the repo: https://github.com/microsoft/Swin-Transformer 2. Create the file export.py in the repo directory and run `python export.py --eval --cfg configs/swin/swin_tiny_patch4_window7_224.yaml --resume ../weights/swin_tiny_patch4_window7_224.pth --data-path data/ --local_rank 0` # -------------------------------------------------------- # Swin Transformer # Copyright (c) 2021 Microsoft # Licensed under The MIT License [see LICENSE for details] # Written by Ze […]
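
A minimal sketch of what such an export.py can look like. It uses timm as a stand-in for the repo's own config-driven model builder, so the model name, input size, and opset version are assumptions rather than the repo's exact code:

```python
# export.py -- hedged ONNX export sketch for a Swin Transformer.
# In the actual repo the model would be built from the YAML config and the
# .pth checkpoint loaded; timm is used here only as a convenient stand-in.
import torch
import timm

model = timm.create_model("swin_tiny_patch4_window7_224", pretrained=True)
model.eval()

dummy_input = torch.randn(1, 3, 224, 224)  # NCHW input for the 224x224 config
torch.onnx.export(
    model,
    dummy_input,
    "swin_tiny_patch4_window7_224.onnx",
    input_names=["input"],
    output_names=["output"],
    opset_version=12,
    dynamic_axes={"input": {0: "batch"}, "output": {0: "batch"}},
)
```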

C# ONNX LSTR: Transformer-based end-to-end real-time lane line detection

Table of Contents: Effect · Model information · Project code download
Effect: end-to-end real-time lane line detection.
Model information: lstr_360x640.onnx
Inputs
  name: input_rgb    tensor: Float[1, 3, 360, 640]
  name: input_mask   tensor: Float[1, 1, 360, 640]
Outputs
  name: pred_logits  tensor: Float[1, 7, 2]
  name: pred_curves  tensor: Float[1, 7, 8]
  name: foo_out_1    tensor: Float[1, 7, 2]
  name: foo_out_2    tensor: Float[1, 7, 8]
  name: weights […]
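
As a quick illustration of the I/O layout above, here is a hedged onnxruntime sketch (Python shown for brevity; the article itself uses C#). The file name and tensor names follow the table; image preprocessing and curve decoding are omitted:

```python
# Sketch of driving lstr_360x640.onnx with onnxruntime, using dummy inputs
# shaped to match the model information table above.
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("lstr_360x640.onnx")

input_rgb = np.random.rand(1, 3, 360, 640).astype(np.float32)   # normalized image
input_mask = np.zeros((1, 1, 360, 640), dtype=np.float32)       # padding mask

# None -> return all outputs: pred_logits, pred_curves, foo_out_1, foo_out_2, weights
outputs = session.run(None, {"input_rgb": input_rgb, "input_mask": input_mask})
pred_logits, pred_curves = outputs[0], outputs[1]
print(pred_logits.shape, pred_curves.shape)  # (1, 7, 2), (1, 7, 8)
```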

AI modeling and training practice based on HF transformers

We often use scikit-learn to model data for both supervised and unsupervised learning tasks. We are familiar with its object-oriented design, such as instantiating a class and calling its methods. However, when I personally use PyTorch, I find design patterns that are similar to scikit-learn's but not quite the same. […]
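
An illustrative contrast (not from the article): scikit-learn hides the training loop behind a single fit() call, while PyTorch expects you to write the loop yourself. The toy data and model below are assumptions for demonstration only:

```python
import numpy as np
import torch
import torch.nn as nn
from sklearn.linear_model import LogisticRegression

# Toy binary-classification data
X = np.random.randn(64, 10).astype("float32")
y = (X[:, 0] > 0).astype("int64")

# scikit-learn: one call trains the whole model
clf = LogisticRegression().fit(X, y)

# PyTorch: the equivalent steps are written out explicitly
model = nn.Linear(10, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()
Xt, yt = torch.from_numpy(X), torch.from_numpy(y)
for _ in range(100):
    optimizer.zero_grad()
    loss = loss_fn(model(Xt), yt)
    loss.backward()
    optimizer.step()
```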

Reprint: TransXNet: A new CNN-Transformer visual backbone that aggregates global and local information, with powerful performance!

Article address: https://arxiv.org/abs/2310.19380 Project address: https://github.com/LMMMEng/TransXNet 00 | Introduction Current situation: recent research integrates convolutions into transformers to introduce inductive bias and improve generalization. (1) The static nature of traditional convolution means it cannot adapt dynamically to changes in the input, which creates a representation gap between convolution and self-attention, since self-attention dynamically computes the attention […]

Transformer-based decoder object detection framework (modifying DETR source code)

Tip: a Transformer-based decoder for object detection, including loss calculation, with source code attached. Article directory: Preface · 1. Interpretation of the main function code (1. Understanding the overall structure, 2. Main function code interpretation, 3. Source code link) · 2. Interpretation of the decoder module code (1. The TransformerDec module, 2. The TransformerDecoder module […]

[RNN+Encrypted Traffic A] ET-BERT: A Contextualized Datagram Representation with Pre-training Transformers for…

Article directory: Introduction to the paper · Abstract · Problems · Paper contributions · 1. ET-BERT · 2. Experiments · Summary · Paper content · Dataset · Readable citations · Reference links. Introduction to the paper — original title: ET-BERT: A Contextualized Datagram Representation with Pre-training Transformers for Encrypted Traffic Classification. Chinese title: ET-BERT: A datagram contextual representation method based on pre-trained transformers for encrypted traffic […]

transformers - Generation with LLMs

https://huggingface.co/docs/transformers/main/en/llm_tutorial The stopping condition is determined by the model: it should learn when to output an end-of-sequence (EOS) token. If that does not happen, generation stops when a predefined maximum length is reached. from transformers import AutoModelForCausalLM model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1", device_map="auto", load_in_4bit=True) from transformers import AutoTokenizer tokenizer […]
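
A possible continuation of the tutorial snippet above, showing how the tokenizer and model.generate fit together; the prompt text and max_new_tokens value are illustrative choices, not part of the excerpt:

```python
# Continuation sketch: load model and tokenizer, then generate from a prompt.
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1", device_map="auto", load_in_4bit=True
)
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")

inputs = tokenizer("A list of colors: red, blue", return_tensors="pt").to(model.device)
generated_ids = model.generate(**inputs, max_new_tokens=40)  # stops at EOS or length limit
print(tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0])
```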

Transformer model training structure analysis (to deepen understanding)

Some reflections from running projects: the execution flow of an end-to-end deep learning project often needs to be sorted out. Modules are usually nested layer by layer, and execution jumps back and forth between multiple Python modules. Sometimes, behind a short line of code in a […]