Introduction
Every developer wants to show off their recent work, but manually updating a "recent projects" list is tedious. We wanted a live feed of our GitHub commits directly on makeitexist.net.
The challenge? Fetching data from an external API (like GitHub's) live on every page load can slow down your site significantly. We needed a reliable, asynchronous solution.
We combined Django for the web app, Celery and Redis for background processing, and the GitHub REST API for the data. This ensures the user experience remains fast while the data syncs reliably in the background.
Phase 1: Setting up the Data Pipeline
We needed a place to store the GitHub data locally so our website could read from its own database instead of constantly hitting GitHub's servers.
Django Models
We created simple Django models (Repository and Commit) to structure the incoming JSON data in our database:
# github_feed/models.py
from django.db import models

class Repository(models.Model):
    # ... fields for repo name, owner, URL ...
    pass

class Commit(models.Model):
    """Stores individual commit information."""
    sha = models.CharField(max_length=40, unique=True, primary_key=True)
    repository = models.ForeignKey(Repository, on_delete=models.CASCADE, related_name='commits')
    message = models.TextField()
    author_name = models.CharField(max_length=100)
    date = models.DateTimeField()
    html_url = models.URLField()

    class Meta:
        # Default ordering: newest first, then by repo name
        ordering = ['-date', 'repository__name']
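Because sha is the primary key, re-running a sync can update existing rows instead of creating duplicates. As a minimal sketch, a helper like the hypothetical commit_fields below could map one item of GitHub's commit-list JSON (the key paths follow the REST API's /repos/{owner}/{repo}/commits response shape) onto our model fields:

```python
def commit_fields(data):
    """Map one item of GitHub's commit-list JSON onto our Commit fields.

    `data` is a single element of the /repos/{owner}/{repo}/commits response.
    """
    return {
        "sha": data["sha"],
        "message": data["commit"]["message"],
        "author_name": data["commit"]["author"]["name"],
        "date": data["commit"]["author"]["date"],
        "html_url": data["html_url"],
    }
```

The resulting dict can then be fed to Commit.objects.update_or_create(sha=..., defaults=...), so repeated syncs update commits in place rather than raising duplicate-key errors.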
Secure Authentication
We stored our GitHub Personal Access Token (PAT) securely in a .env file using django-environ to keep credentials out of our code:
# .env file
GITHUB_PAT=ghp_YourActualTokenHere12345
GITHUB_USERNAME=your_github_username
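On the Django side, a minimal settings.py sketch (assuming django-environ is installed and the .env file sits next to the settings module) reads those values into configuration:

```python
# settings.py snippet -- a sketch assuming django-environ is installed
import environ

env = environ.Env()
environ.Env.read_env()  # loads key/value pairs from the .env file

GITHUB_PAT = env("GITHUB_PAT")
GITHUB_USERNAME = env("GITHUB_USERNAME")
```

The .env file itself stays out of version control (add it to .gitignore), so the token never lands in the repository.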
Phase 2: The Magic of Asynchronous Tasks (Celery & Redis)
The core challenge was fetching all commits from all repositories without blocking the website. We configured a Celery worker process that runs independently of the main Django server.
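The wiring for that worker is a standard Celery-for-Django setup. A minimal sketch, with "proj" standing in for the actual project package name and Redis running locally as the broker:

```python
# proj/celery.py -- minimal sketch; "proj" is a placeholder project name
import os

from celery import Celery

# Make Django settings available to tasks before the app is configured.
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "proj.settings")

app = Celery("proj", broker="redis://localhost:6379/0")
app.config_from_object("django.conf:settings", namespace="CELERY")
app.autodiscover_tasks()  # picks up tasks.py modules in installed apps
```

The worker then runs as its own process alongside the web server, e.g. `celery -A proj worker -l info`.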
We wrote Python tasks in github_feed/tasks.py to handle the heavy lifting:
# github_feed/tasks.py snippet
from celery import shared_task

@shared_task
def sync_all_github_data():
    """Main task to orchestrate syncing all repositories."""
    # ... logic to fetch list of repositories from GitHub API ...
    repositories_data = fetch_paginated_data(repos_url)
    for repo_data in repositories_data:
        # ... save/update the Repository row, yielding repo_instance ...
        # Queue up individual tasks for each repository
        fetch_commits_for_repo.delay(repo_instance.repo_id, repo_data['commits_url'])

@shared_task
def fetch_commits_for_repo(repo_id, commits_url):
    """Fetches all commits for a specific repository instance."""
    # ... logic to fetch paginated commits and save to Commit model ...
    pass
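GitHub paginates its list endpoints via the Link response header, which the requests library exposes as response.links. A sketch of how a helper like fetch_paginated_data might follow that chain (the token parameter and the injectable get callable are our additions for illustration and testing, not part of the snippet above):

```python
import requests

def fetch_paginated_data(url, token, get=None):
    """Collect all pages of a GitHub list endpoint by following Link headers."""
    get = get or requests.get
    headers = {
        "Authorization": f"Bearer {token}",
        "Accept": "application/vnd.github+json",
    }
    results = []
    while url:
        resp = get(url, headers=headers, timeout=10)
        resp.raise_for_status()
        results.extend(resp.json())
        # requests parses the Link header into resp.links;
        # the "next" entry is absent on the final page.
        url = resp.links.get("next", {}).get("url")
    return results
```

Injecting `get` keeps the function unit-testable without touching the network; in production the default requests.get is used.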
Phase 3: Presentation and Organization
Once the data was in the database, we needed to display it clearly on our main landing page. We chose to group the data by Date and then Repository.
The View Logic
We used Python's defaultdict within our landing/views.py to organize the flat list of commits into the desired hierarchy:
# landing/views.py snippet for context processing
from collections import defaultdict

# Group commits by Date Object -> Repo Name -> List of Commits
grouped_commits = defaultdict(lambda: defaultdict(list))
# ... (processing loops and sorting logic) ...
context = {"commits_by_date_and_repo": sorted_commits_display_data}
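The elided processing loop boils down to bucketing each commit under its calendar date and repository, then sorting dates newest-first and repositories alphabetically. A simplified, self-contained sketch of that shape (the real view iterates over Commit instances; here we use plain tuples for illustration):

```python
from collections import defaultdict

def group_commits(commits):
    """Group (datetime, repo_name, message) tuples by date, then repo."""
    grouped = defaultdict(lambda: defaultdict(list))
    for dt, repo, msg in commits:
        grouped[dt.date()][repo].append(msg)
    # Newest dates first; repositories alphabetical within each date.
    return {
        d: dict(sorted(grouped[d].items()))
        for d in sorted(grouped, reverse=True)
    }
```

Converting back to plain dicts at the end also sidesteps a known template gotcha: iterating a defaultdict in a Django template can silently create keys on attribute lookup.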
The Template
We then used nested Django template loops ({% for %}) to render this organized data structure cleanly in landing/landing.html:
<!-- landing/landing.html snippet for the feed section -->
<div class="content-section github-feed-section">
  <div class="container">
    <!-- Loop 1: Group by Date -->
    {% for date_obj, repos in commits_by_date_and_repo.items %}
      <div class="date-group">
        <h3>{{ date_obj|date:"l, M d, Y" }}</h3>
        <!-- Loop 2: Group by Repository -->
        {% for repo_name, commits in repos.items %}
          <!-- ... (inner loops and display logic) ... -->
          <div class="repo-group">
            <h4>
              Repository:
              <a href="{{ commits.0.repository.html_url }}" target="_blank">{{ repo_name }}</a>
            </h4>
            <ul class="commit-list">
              {% for commit in commits %}
                <li class="commit-item">
                  <span class="commit-sha">
                    <code><a href="{{ commit.html_url }}" target="_blank">{{ commit.sha|slice:":7" }}</a></code>
                  </span>
                  <span class="commit-separator">—</span>
                  <span class="commit-message">
                    {{ commit.message|truncatechars:120 }}
                  </span>
                </li>
              {% endfor %}
            </ul>
          </div>
        {% endfor %}
      </div>
      <hr>
    {% endfor %}
  </div>
</div>
Conclusion
By decoupling our frontend display from the backend data fetching using Celery tasks, we achieved a responsive, near-real-time activity feed: pages render instantly from the local database while the worker keeps it in sync. The final configuration uses UTC consistently across the server stack, which keeps the date grouping and commit timestamps unambiguous.