Merged
1 change: 1 addition & 0 deletions .gitattributes
@@ -1,3 +1,4 @@
.gitignore merge=ours
README.md merge=ours
docker-compose.yml merge=ours
backend/** merge=ours
File renamed without changes.
4 changes: 4 additions & 0 deletions .gitignore
@@ -58,6 +58,10 @@ tempCodeRunnerFile.py
unit_test.py
testing_workflow.py
*.yaml
local.settings.json
playwright_browser
__pycache__
docker-compose.yml

scripts/
playwright_browser
74 changes: 61 additions & 13 deletions README.md
@@ -1,12 +1,11 @@
# CiteMe - Automatic Citation Generation System

CiteMe is a modern, full-stack application designed to help students generate references and in-line citations and references efficiently. The system provides intelligent citation suggestions, reference management, and seamless integration with academic databases.
CiteMe is a modern, full-stack application designed to help students generate references and in-line citations efficiently. The system provides intelligent citation suggestions, reference management, and seamless integration with academic databases.

Students do not have to worry about searching for sources to back essays and thesis. This web app will search the web , format your document with intext citation and include the references, sources and metrics to grade the credibility of the sources.
Students do not have to worry about searching for sources to back essays and thesis. This web app will search the web, format your document with intext citation and include the references, sources and metrics to grade the credibility of the sources.

The webapp also offers the choice of providing your own sources, in forms of urls, texts and pdfs and is able to use these sources to format your essays/thesis with intext citation and references in any citation format.


🌐 **Live Demo**: [CiteMe Editor](https://cite-me-wpre.vercel.app/editor)

## 🚀 Features
@@ -17,6 +16,10 @@ The webapp also offers the choice of providing your own sources, in forms of url
- **Real-time Metrics**: Track citation impact and academic metrics
- **Modern UI**: Responsive and intuitive user interface
- **API Integration**: Seamless integration with academic databases and search engines
- **Web Scraping**: Intelligent web scraping with Playwright for source extraction
- **Vector Search**: Efficient document retrieval using Pinecone vector database
- **AI-Powered**: Integration with multiple AI models (Azure, Groq, Gemini) for citation generation
- **Credibility Scoring**: Automated source credibility assessment

## 📁 Project Structure

@@ -29,10 +32,13 @@ CiteMe/
│ └── dist/ # Production build
├── backend/
│ ├── mainService/ # Core citation service
│ └── metricsService/ # Analytics and metrics service
├── .github/ # GitHub workflows and templates
├── docker-compose.yml # Docker services configuration
└── README.md # Project documentation
│ │ ├── src/ # Source code
│ │ ├── scripts/ # Utility scripts
│ │ └── config/ # Configuration files
│ └── metricsService/ # Analytics and metrics service
├── .github/ # GitHub workflows and templates
├── docker-compose.yml # Docker services configuration
└── README.md # Project documentation
```

## 🏗️ Architecture
@@ -41,7 +47,14 @@ The application is built using a microservices architecture with three main comp

1. **Frontend Service**: Vue.js 3 application hosted on Vercel
2. **Main Service**: FastAPI-based backend service handling core citation functionality
- Web scraping with Playwright
- Vector search with Pinecone
- AI model integration (Azure, Groq, Gemini)
- Citation generation and formatting
3. **Metrics Service**: FastAPI-based service for handling academic metrics and analytics
- Source credibility assessment
- Citation impact analysis
- Academic metrics tracking

## 🛠️ Tech Stack

@@ -56,12 +69,34 @@ The application is built using a microservices architecture with three main comp
### Backend
- Python 3.11
- FastAPI
- Pinecone
- Gemini
- Pinecone (Vector Database)
- Gemini (Google AI)
- Groq
- Azure hosted LLMs
- Mixbread (Reranking)
- LangChain
- Playwright (Web Scraping)
- Various AI/ML libraries

## 🔑 Environment Setup

Before running the services, you'll need to set up the following API keys:

1. Google API Keys:
- `CX`: Google Programmable Search Engine ID
- `GPSE_API_KEY`: Google Programmable Search Engine API key
- `GOOGLE_API_KEY`: Gemini API key

2. AI Service Keys:
- `GROQ_API_KEY`: Groq API key
- `PINECONE_API_KEY`: Pinecone vector database
- `MIXBREAD_API_KEY`: Mixbread reranking service
- `AZURE_MODELS_ENDPOINT`: Azure endpoint for citation generation

3. Optional Services:
- `CREDIBILITY_API_URL`: URL for the credibility metrics service
- `SERVERLESS`: Set to TRUE for serverless mode
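
The key list above could be checked at service startup with something like the following. This is a minimal sketch, not code from the repository: the variable names come from the list, but treating the first two groups as required, the helper names `missing_keys` and `serverless_enabled`, and the case-insensitive reading of `SERVERLESS` are all assumptions made for illustration.

```python
import os

# Names taken from the README list; the required/optional split is an
# assumption based on the "Optional Services" heading.
REQUIRED_KEYS = [
    "CX",
    "GPSE_API_KEY",
    "GOOGLE_API_KEY",
    "GROQ_API_KEY",
    "PINECONE_API_KEY",
    "MIXBREAD_API_KEY",
    "AZURE_MODELS_ENDPOINT",
]
OPTIONAL_KEYS = ["CREDIBILITY_API_URL", "SERVERLESS"]


def missing_keys(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS if not env.get(key)]


def serverless_enabled(env=None):
    """Hypothetical helper: treat SERVERLESS=TRUE (any case) as enabled."""
    env = os.environ if env is None else env
    return env.get("SERVERLESS", "").strip().upper() == "TRUE"


if __name__ == "__main__":
    missing = missing_keys()
    if missing:
        raise SystemExit("Missing environment variables: " + ", ".join(missing))
    print("All required keys are set; serverless mode:", serverless_enabled())
```

Running the script before starting the services reports every missing key at once, rather than failing on the first API call that needs one.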

## 🚀 Getting Started

### Prerequisites
@@ -78,9 +113,10 @@ git clone https://github.com/yourusername/citeme.git
cd citeme
```

2. Create `.env` files in both service directories:
- `backend/mainService/.env`
- `backend/metricsService/.env`
2. Create a `.env` file in the root directory with all required API keys:
```bash
cp backend/mainService/.env.example .env
```

3. Build and run the services using Docker Compose:
```bash
@@ -174,11 +210,23 @@ cd ../metricsService
pytest
```

## 🔄 CI/CD Pipeline

The project uses GitHub Actions for continuous integration and deployment:

- **Automated Testing**: Runs on every push to main and pull requests
- **Python 3.11**: Uses the latest Python 3.11 environment
- **Test Dependencies**: Installs both main and test requirements
- **PR Management**: Automatically closes failed PRs with explanatory comments
- **Environment Variables**: Securely manages API keys and configuration

The pipeline can be found in `.github/workflows/python-ci-cd.yml`.

## 📦 Docker Images

The backend services have their own Dockerfiles:

- `backend/mainService/Dockerfile`: Python-based main service
- `backend/mainService/Dockerfile`: Python-based main service with Playwright support
- `backend/metricsService/Dockerfile`: Python-based metrics service

## 🤝 Contributing
15 changes: 8 additions & 7 deletions backend/mainService/Dockerfile
@@ -2,12 +2,13 @@ FROM python:3.11-slim

WORKDIR /app

# Install system dependencies
# Install system dependencies including Playwright requirements
# Installs essential tools for compiling software from source, often needed for Python package dependencies.(build-essential)
# Removes the package lists downloaded during the update to reduce the image size.
RUN apt-get update && apt-get install -y \
build-essential \
cron \
wget \
&& rm -rf /var/lib/apt/lists/*

# Set the PATH environment variable to include /app
@@ -19,18 +20,18 @@ COPY requirements.txt .
# Install Python dependencies
RUN pip install --no-cache-dir -r requirements.txt

# Install Playwright and its dependencies
RUN playwright install && playwright install-deps

# Create necessary directories
RUN mkdir -p /app/config /tmp/downloads

# Copy the source code
COPY ./scripts/ /app/scripts/
COPY ./src/ /app/src/
COPY ./app.py /app/app.py
COPY ./__init__.py /app/__init__.py

# Create a directory for runtime configuration
RUN mkdir -p /app/config

# Install playwright
RUN playwright install && playwright install-deps

# Expose the port the app runs on
EXPOSE 8000

18 changes: 5 additions & 13 deletions backend/metricsService/Dockerfile
@@ -3,28 +3,20 @@ FROM python:3.11-slim
WORKDIR /app

# Install system dependencies
# Installs essential tools for compiling software from source, often needed for Python package dependencies.(build-essential)
# Removes the package lists downloaded during the update to reduce the image size.
RUN apt-get update && apt-get install -y \
build-essential \
&& rm -rf /var/lib/apt/lists/*

# Set the PATH environment variable to include /app
ENV PATH="/app:${PATH}"

# Copy requirements first to leverage Docker cache
# Copy requirements first
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Install Python dependencies
RUN pip install --no-cache-dir -r requirements.txt

# Copy the rest of the application
# Copy the application
COPY ./src/ /app/src/

RUN cd /app/src
# Create necessary directories
RUN mkdir -p /app/config

# Expose the port the app runs on
EXPOSE 8000

# Command to run the application
CMD ["uvicorn", "src.main:app", "--host", "0.0.0.0", "--port", "8000"]
Binary file removed backend/metricsService/README.md
Binary file not shown.
36 changes: 0 additions & 36 deletions docker-compose.yml

This file was deleted.

2 changes: 1 addition & 1 deletion frontend/src/components/MainPageHeader.vue
@@ -85,7 +85,7 @@ const toggleView = () => {
type="text"
placeholder="Untitled"
required
maxlength="50"
maxlength="150"
/>
</div>

7 changes: 7 additions & 0 deletions frontend/vercel.json
@@ -0,0 +1,7 @@
{
"rewrites": [
{ "source": "/editor", "destination": "/index.html" },
{ "source": "/preview", "destination": "/index.html" }
]
}