init
This commit is contained in:
147
README.MD
Normal file
147
README.MD
Normal file
@@ -0,0 +1,147 @@
|
||||
# WeChat OCR API Docker
|
||||
|
||||
A Dockerized REST API service for text recognition using WeChat's OCR engine.
|
||||
|
||||
## Overview
|
||||
|
||||
This project wraps the WeChat OCR functionality from the excellent [wechat-ocr](https://github.com/swigger/wechat-ocr) project into a simple REST API service that can be easily deployed using Docker. It allows you to perform optical character recognition on images by leveraging WeChat's powerful OCR capabilities.
|
||||
|
||||
## Acknowledgements
|
||||
|
||||
This project would not be possible without the work of [swigger](https://github.com/swigger) and their [wechat-ocr](https://github.com/swigger/wechat-ocr) project. Their efforts in reverse-engineering and creating a usable interface for WeChat's OCR functionality form the foundation of this service.
|
||||
|
||||
## Quick Start
|
||||
|
||||
### Using Docker
|
||||
|
||||
```bash
|
||||
# Pull the image
|
||||
docker pull golangboyme/wxocr
|
||||
|
||||
# Run the container
|
||||
docker run -d -p 5000:5000 --name wechat-ocr-api golangboyme/wxocr
|
||||
```
|
||||
|
||||
### API Usage
|
||||
|
||||
Send a POST request to `/ocr` with a JSON payload containing your base64-encoded image:
|
||||
|
||||
```bash
|
||||
curl -X POST http://localhost:5000/ocr \
|
||||
-H "Content-Type: application/json" \
|
||||
-d '{"image": "BASE64_ENCODED_IMAGE_DATA"}'
|
||||
```
|
||||
|
||||
#### Example Response
|
||||
|
||||
```json
|
||||
{
|
||||
"errcode": 0,
|
||||
"height": 72,
|
||||
"width": 410,
|
||||
"imgpath": "temp/5726fe7b-25d6-43a6-a50d-35b5f668fbb6.png",
|
||||
"ocr_response": [
|
||||
{
|
||||
"text": "aacss",
|
||||
"left": 80.63632202148438,
|
||||
"top": 29.634929656982422,
|
||||
"right": 236.47093200683594,
|
||||
"bottom": 55.28932189941406,
|
||||
"rate": 0.9997046589851379
|
||||
},
|
||||
{
|
||||
"text": "xxzsa",
|
||||
"left": 312.625,
|
||||
"top": 30.75,
|
||||
"right": 395.265625,
|
||||
"bottom": 55.09375,
|
||||
"rate": 0.997739315032959
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
### Python Client Example
|
||||
|
||||
Here's a simple Python client to use the OCR API:
|
||||
|
||||
```python
|
||||
import requests
|
||||
import base64
|
||||
import os
|
||||
|
||||
def ocr_recognize(image_path=None, image_url=None, api_url="http://localhost:5000/ocr"):
|
||||
"""
|
||||
Send an image to the OCR API service and get the recognition results.
|
||||
Use either image_path or image_url (one is required).
|
||||
"""
|
||||
# Get image data
|
||||
if image_path:
|
||||
if not os.path.exists(image_path):
|
||||
print(f"Error: Local image not found: {image_path}")
|
||||
return
|
||||
with open(image_path, "rb") as image_file:
|
||||
img_data = image_file.read()
|
||||
elif image_url:
|
||||
try:
|
||||
response = requests.get(image_url)
|
||||
response.raise_for_status()
|
||||
img_data = response.content
|
||||
except Exception as e:
|
||||
print(f"Failed to download image: {str(e)}")
|
||||
return
|
||||
else:
|
||||
print("Please provide either image_path or image_url")
|
||||
return
|
||||
|
||||
# Convert image to base64
|
||||
base64_image = base64.b64encode(img_data).decode('utf-8')
|
||||
|
||||
# Send request to API
|
||||
try:
|
||||
response = requests.post(api_url, json={"image": base64_image})
|
||||
response.raise_for_status()
|
||||
return response.json()
|
||||
except Exception as e:
|
||||
print(f"API request failed: {str(e)}")
|
||||
return None
|
||||
|
||||
# Example usage
|
||||
if __name__ == "__main__":
|
||||
# Local image example
|
||||
result = ocr_recognize(image_path="ocrtest.png")
|
||||
if result:
|
||||
print(result)
|
||||
|
||||
# URL image example (uncomment to use)
|
||||
# result = ocr_recognize(image_url="https://example.com/image.png")
|
||||
```
|
||||
|
||||
## Project Structure
|
||||
|
||||
- `main.py`: The Flask API service that handles OCR requests
|
||||
- `opt/wechat/wxocr`: WeChat OCR binary
|
||||
- `opt/wechat/`: WeChat runtime dependencies
|
||||
|
||||
## Technical Details
|
||||
|
||||
This service uses a Flask application to provide a REST API interface to the WeChat OCR functionality. When an image is submitted:
|
||||
|
||||
1. The base64-encoded image is decoded
|
||||
2. A temporary file is created
|
||||
3. The image is processed by the WeChat OCR engine via the wcocr Python binding
|
||||
4. Results are returned in JSON format
|
||||
5. Temporary files are cleaned up
|
||||
|
||||
## Limitations
|
||||
|
||||
- Currently only supports PNG images (can be extended if needed)
|
||||
- Depends on WeChat's OCR binaries which may be updated by WeChat
|
||||
|
||||
## License
|
||||
|
||||
This project is licensed under the MIT License - see the LICENSE file for details.
|
||||
|
||||
## Contributing
|
||||
|
||||
Contributions are welcome! Please feel free to submit a Pull Request.
|
||||
Reference in New Issue
Block a user