Sosse is an open-source search engine and crawler you can host yourself. It indexes, archives, and searches websites, with a special focus on modern, dynamic pages that rely heavily on JavaScript. Because it uses browser-based crawling, it can capture content that simpler tools often miss, building a searchable archive of web pages for your research or data projects.
Its capabilities make it a versatile tool for a wide range of tasks:
services:
  sosse:
    image: biolds/sosse:1.14
    container_name: sosse
    restart: unless-stopped
    ports:
      - "8080:8080"
    volumes:
      - ./data:/app/data
    environment:
      - ADMIN_PASSWORD=${ADMIN_PASSWORD}
ADMIN_PASSWORD=your_super_secret_password
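The placeholder password above is meant to be replaced with a real secret. As a minimal sketch, assuming Python 3 is available on the host, the helper below generates a random value and formats it as a line in the same `NAME=value` form. The function name `make_env_line` and the 32-byte length are illustrative choices, not part of Sosse.

```python
import secrets


def make_env_line(name: str = "ADMIN_PASSWORD", nbytes: int = 32) -> str:
    """Return a NAME=value line holding a random URL-safe secret.

    `make_env_line` is a hypothetical helper for illustration only.
    """
    # token_urlsafe() emits only letters, digits, '-' and '_', so the
    # value needs no quoting or escaping in an environment file.
    return f"{name}={secrets.token_urlsafe(nbytes)}"


if __name__ == "__main__":
    # Print the line so it can be pasted or redirected into the env file.
    print(make_env_line())
```

Running the script prints a single line that can be used in place of the placeholder; each run produces a different value.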