A Cabinet of Brief Curiosities

Name: A Cabinet of Brief Curiosities
URL: https://acbc.yusupov.cloud
Developer: Michel Vuijlsteke
Released: 2025
Genre: AI-generated short-fiction application
Language: Python
Framework: Flask 3.0
License: MIT

A Cabinet of Brief Curiosities (abbreviated acbc) is a web application hosted at acbc.yusupov.cloud that generates illustrated three-sentence short stories in the style of H. P. Lovecraft. Each story is composed by a large language model from a structured set of randomised "knobs" and optional user-supplied seed words, and is paired with a black-and-white illustration produced by a generative image model and styled to resemble a 19th-century engraved book plate. The site's rotating tagline — drawn every thirty minutes from a pool of forty-five variants — frames the act of generation in deliberately archaic terms; the canonical opening reads "Dredge slivers of impossible worlds from the black gulfs of imagination and press them into trembling mortal words."

Technology stack

The application is built on Flask 3.0 with SQLite as its database, accessed through SQLAlchemy (Flask-SQLAlchemy 3.1).[1] Authentication is handled by Flask-Login and cross-site request forgery protection by Flask-WTF. It is deployed on an Ubuntu VPS behind Nginx with Gunicorn running over a Unix socket under a dedicated django system user, supervised by systemd. Additional dependencies include the OpenAI Python client for both text and image generation, Pillow for image post-processing, python-dotenv for environment configuration, and httpx as the underlying HTTP transport. The front end is rendered from Jinja templates using a small Bootstrap-derived stylesheet (static/style.css) and is registered as an installable progressive web app via static/manifest.json and a service worker (static/sw.js).

Data model

The database contains two tables.

User

User stores a single bootstrapped administrator account with an e-mail address, a Werkzeug password hash, and a creation timestamp. Sign-ups are disabled at the route level: the /signup endpoint flashes a notice and redirects to login. The administrator is created on application start from the ADMIN_EMAIL and ADMIN_PASSWORD environment variables if no matching row exists.

Story

Story stores the generated short fiction:

  • id (primary key), an optional user_id foreign key (null for guest submissions), the three-sentence story_text, the optional mood, nouns and verbs seeds supplied by the requester, an optional image_path (relative to the static folder, e.g. images/story_42.png), the originating ip_address (indexed), an indexed created_at timestamp, and a JSON-encoded knobs_json column that records the parameter set the model was given.

The schema is created on first run by SQLAlchemy. A small startup routine inspects the live table and issues an ALTER TABLE if a newer column (such as knobs_json) is missing, providing a lightweight migration path without an external migration framework. The same routine sets PRAGMA journal_mode=WAL on the SQLite database to allow concurrent reads while a background image-generation thread writes.
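
A minimal sketch of such a startup routine, assuming a Flask-SQLAlchemy db handle and the default "story" table name, might look like this:

    from sqlalchemy import inspect, text

    def ensure_schema(app, db):
        """Create missing tables, backfill newer columns, and enable WAL."""
        with app.app_context():
            db.create_all()  # creates any table that does not exist yet

            # Lightweight migration: add columns that older databases lack.
            columns = {c["name"] for c in inspect(db.engine).get_columns("story")}
            if "knobs_json" not in columns:
                db.session.execute(text("ALTER TABLE story ADD COLUMN knobs_json TEXT"))

            # Allow concurrent reads while the background image thread writes.
            db.session.execute(text("PRAGMA journal_mode=WAL"))
            db.session.commit()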

Story generation pipeline

Story generation is initiated by a POST to /generate and proceeds through a mix of deterministic and randomised stages before the OpenAI call.

Seeds and knobs

A request may carry up to three optional free-text fields — noun, verb, and mood — which are passed to the model as "seeds" and validated for presence in the output. Around the seeds, the application constructs a compact JSON knobs object that nudges the model along several axes:

  • Perspective (pool of 4): first, second, third, omniscient
  • Structure (pool of 8): Discovery → Investigation → Revelation; Object → Rumor → Catastrophe; Signal → Interpretation → Realization; …
  • Time (pool of 8, forced to ambiguous): Victorian era, distant past, mythic period, interwar, near-future of obsolete technology
  • Location (pool of 50): lighthouse, salt marsh, foundry, signal box, scriptorium, observatory, shipbreaker's yard, …
  • Situation (pool of 32): during a storm, at low tide, during a blackout, on the eve of demolition, while the clock refuses to strike, …
  • Lexical palette (pool of 53): nautical, horological, astronomical, archival, cartographic, mycological, glaciological, heraldic, …
  • Style constraint (pool of 5): include exactly one short line of dialogue; include a question; avoid the words "shadow" and "dim"; …
  • Sentence pattern (pool of 3): short → long → medium; medium → short → long; long → medium → short
  • Theme (pool of 18, only when no seeds): forbidden knowledge, cosmic entities, inherited curses, mathematical theorems, astronomical observations, …

The knobs object always carries the seeds and at most two additional non-seed dimensions. Three of the knobs — perspective, structure, and time — are selected deterministically from a SHA-256 hash of a salt composed of the current 30-minute time bucket, the requester's IP address, and the seed words; this guarantees that the same client requesting the same seeds inside the same half-hour window draws the same priority knobs. The previous story's knobs for that IP are read from the knobs_json column (falling back to the JSONL choices log for older rows) and avoided where possible. Time is hard-pinned to ambiguous to suppress dated or era-specific references in the output. Lexicon and constraint values are sampled with the regular pseudo-random generator. Once all priority knobs are forced into the object, any non-priority extras are dropped at random until at most two non-seed knobs remain.
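
A sketch of the deterministic part of that selection, with abridged pools and an assumed salt layout, illustrates the idea; only the half-hour bucket, the IP, and the seeds influence the result:

    import hashlib
    import time

    PERSPECTIVES = ["first", "second", "third", "omniscient"]
    STRUCTURES = ["Discovery → Investigation → Revelation",
                  "Object → Rumor → Catastrophe"]  # abridged

    def pick_priority_knobs(ip: str, seeds: dict) -> dict:
        """Derive the priority knobs from a SHA-256 of bucket|ip|seeds."""
        bucket = int(time.time() // 1800)  # current 30-minute window
        salt = "|".join([str(bucket), ip,
                         seeds.get("noun", ""), seeds.get("verb", ""), seeds.get("mood", "")])
        digest = hashlib.sha256(salt.encode("utf-8")).digest()
        return {
            "perspective": PERSPECTIVES[digest[0] % len(PERSPECTIVES)],
            "structure": STRUCTURES[digest[1] % len(STRUCTURES)],
            "time": "ambiguous",  # hard-pinned, as described above
        }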

System rules

The system prompt is composed entirely of positive instructions: it asks for exactly three sentences in a grave Lovecraftian register, with no titles, lists, numbering, or blank lines, and embeds two short hand-written exemplar stories so the model can imitate length, cadence, and concreteness rather than reason in the negative. The application then validates the response against a separate banned-word list rather than naming forbidden terms in the prompt itself, in order to avoid the well-documented tendency of language models to reproduce items they are explicitly told to avoid.

Model selection

The default text model is GPT-4o (configurable via OPENAI_MODEL) with a configurable fallback (OPENAI_FALLBACK_MODEL, also GPT-4o by default). A helper function inspects the installed OpenAI SDK at request time: if the configured model is in the gpt-5 family but the SDK does not expose the Responses API, the call silently falls back to a chat-compatible model. When the Responses API is available it is preferred for all models, with the system rules supplied via the explicit instructions field. For gpt-5 family models the call additionally injects a reasoning object (defaulting to effort: low but configurable via OPENAI_REASONING_EFFORT), drops the temperature and top_p parameters (which those models reject), and grants a higher max_output_tokens budget (1600 by default, raised to 2400 on retry) to absorb reasoning-token consumption. For non-reasoning models, temperature is sampled uniformly from a narrow [0.78, 0.92] interval and top_p is fixed at 0.9, deliberately tighter than typical creative-writing defaults to keep tone consistent. Where the Responses API is unavailable, the call degrades to Chat Completions.
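
The branching could be sketched as follows; the helper name is an assumption and the parameters shown are limited to those documented for the OpenAI Python SDK:

    import random
    from openai import OpenAI

    client = OpenAI()

    def call_text_model(model: str, system_rules: str, user_prompt: str, retry: bool = False) -> str:
        if hasattr(client, "responses"):  # Responses API available in this SDK
            kwargs = {
                "model": model,
                "instructions": system_rules,
                "input": user_prompt,
                "max_output_tokens": 2400 if retry else 1600,
            }
            if model.startswith("gpt-5"):
                kwargs["reasoning"] = {"effort": "low"}  # reasoning models reject temperature/top_p
            else:
                kwargs["temperature"] = random.uniform(0.78, 0.92)
                kwargs["top_p"] = 0.9
            return client.responses.create(**kwargs).output_text

        # Older SDK: degrade to Chat Completions.
        resp = client.chat.completions.create(
            model=model,
            messages=[{"role": "system", "content": system_rules},
                      {"role": "user", "content": user_prompt}],
            temperature=random.uniform(0.78, 0.92),
            top_p=0.9,
        )
        return resp.choices[0].message.content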

Validation and retry

After the call returns, the output is validated by a checker that enforces:

  • Exactly three non-empty lines.
  • No line beginning with any of a list of banned opening prefixes ("In the", "Beneath", "Under the", "Within the", and so on).
  • Absence of a curated list of overused or out-of-register words and phrases ("eldritch", "cyclopean", "loathsome", any inflection of tentacle, and similar).
  • Absence of anachronisms unsuited to the desired ambiguous-historical tone (kilometres, kilograms, GPS coordinates, e-mail addresses, four-digit years, references to Wi-Fi).
  • A per-sentence word count in the [8, 60] range.
  • One terminal punctuation mark per line.

If the request supplied seeds, a second pass verifies that the noun appears in the text — with allowance for irregular plurals — and that the verb appears in any common inflection.
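
A compressed sketch of such a checker, with the word lists heavily abridged and the thresholds taken from the description above, might read:

    import re

    BANNED_PREFIXES = ("In the", "Beneath", "Under the", "Within the")  # abridged
    BANNED_WORDS = {"eldritch", "cyclopean", "loathsome"}                # abridged

    def validate_story(text: str) -> list[str]:
        """Return a list of problems; an empty list means the story passes."""
        problems = []
        lines = [l.strip() for l in text.splitlines() if l.strip()]
        if len(lines) != 3:
            problems.append(f"expected 3 sentences, got {len(lines)}")
        for line in lines:
            if line.startswith(BANNED_PREFIXES):
                problems.append(f"banned opening: {line[:20]!r}")
            if any(w in line.lower() for w in BANNED_WORDS):
                problems.append("banned word present")
            if not line.endswith((".", "!", "?")):
                problems.append("missing terminal punctuation")
            if not 8 <= len(line.split()) <= 60:
                problems.append("sentence length out of range")
            if re.search(r"\b\d{4}\b", line):
                problems.append("four-digit year (anachronism)")
        return problems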

A separate near-duplicate check computes a 64-bit SimHash of the candidate text and rejects it if the Hamming distance to any of the most recent fifty stored stories is below a small threshold, so the corpus does not collapse into thematic loops.
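
A minimal 64-bit SimHash comparison, assuming a plain whitespace tokeniser and an illustrative threshold of six differing bits, could look like this:

    import hashlib

    def simhash64(text: str) -> int:
        """64-bit SimHash over lower-cased whitespace tokens."""
        weights = [0] * 64
        for token in text.lower().split():
            h = int.from_bytes(hashlib.sha256(token.encode()).digest()[:8], "big")
            for bit in range(64):
                weights[bit] += 1 if (h >> bit) & 1 else -1
        return sum(1 << bit for bit in range(64) if weights[bit] > 0)

    def too_similar(candidate: str, recent_texts: list[str], threshold: int = 6) -> bool:
        """True when the candidate is within `threshold` bits of any recent story."""
        c = simhash64(candidate)
        return any(bin(c ^ simhash64(t)).count("1") < threshold for t in recent_texts)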

When validation fails, a single targeted retry is issued with a corrective prompt that quotes the failing lines back at the model verbatim, names the specific failures, lowers the temperature slightly, and reuses the same knobs JSON. The retry result is accepted if it passes, or if it fails with strictly fewer issues than the original. If the model call itself fails — including the specific case where a gpt-5 response is marked incomplete due to max_output_tokens — the pipeline retries once with a higher token budget, then attempts the configured fallback model, and finally falls back to a fixed three-sentence placeholder. All failures surface to the user as Flask flash messages with appropriate severity.

Persistence and logging

The validated story text is written to a new Story row together with the seeds, the originating IP, and the JSON-encoded knobs object. Image generation is then dispatched in a background daemon thread so the HTTP response returns immediately. Two structured logs are appended in JSON-Lines format under the Flask instance folder:

  • instance/choices.log.jsonl records, per story, the seeds, the knobs JSON, the selected temperature and top_p, the model actually used, the configured model, and the validation outcome (including reasons and whether a retry was attempted).
  • instance/image.log.jsonl records each image generation attempt with status (success, retry, failed, skipped), error text, attempt count, image size, image model, and post-processing diagnostics such as the standard deviation of the output's luminance.

Both logs are rotated to a single <file>.1 backup once they exceed a configurable byte threshold (default 2 MiB) so that long-running instances cannot fill the disk.
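
The append-and-rotate behaviour amounts to only a few lines; the 2 MiB default below mirrors the threshold mentioned above:

    import json
    import os

    def append_jsonl(path: str, record: dict, max_bytes: int = 2 * 1024 * 1024) -> None:
        """Append one JSON object per line, keeping a single .1 backup once the file grows too large."""
        if os.path.exists(path) and os.path.getsize(path) > max_bytes:
            os.replace(path, path + ".1")  # overwrite any previous backup
        with open(path, "a", encoding="utf-8") as fh:
            fh.write(json.dumps(record, ensure_ascii=False) + "\n")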

Image generation

Each story is paired with a square 1024×1024 illustration generated through the OpenAI image API.

Style library

The application maintains a curated library of fifteen monochrome illustration styles, each anchored to one or more named historical artists or movements — for example wood engravings in the manner of Gustave Doré or Thomas Bewick, pen-and-ink in the manner of Aubrey Beardsley or Edward Gorey, drypoint in the manner of Käthe Kollwitz, aquatint in the manner of Goya's Caprichos, a Victorian scientific plate after Ernst Haeckel, relief woodcut after Lynd Ward, silhouette cuts after Lotte Reiniger, and symbolist pen-and-ink after Alfred Kubin. Anchoring each entry to a real reference reliably moves the model closer to the desired register. The style for a given story is selected by hashing its primary key, so adjacent stories on the home grid do not visually collide and a regenerated image for the same story can vary by perturbing the salt. A legacy global counter file at instance/image_style_cycle.txt is retained for callers without a story id.
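
Selecting a style from the story's primary key can be sketched as follows; the style list is abridged and the salt parameter is an assumption used only to vary regenerated images:

    import hashlib

    STYLES = [
        "a Victorian wood engraving in the manner of Gustave Doré",
        "pen-and-ink in the manner of Aubrey Beardsley",
        "drypoint in the manner of Käthe Kollwitz",
        # … twelve further entries …
    ]

    def style_for_story(story_id: int, salt: str = "") -> str:
        digest = hashlib.sha256(f"{story_id}:{salt}".encode("utf-8")).digest()
        return STYLES[digest[0] % len(STYLES)]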

Prompt construction

To keep the image prompt concrete enough for a diffusion model to act on, the three-sentence story is first reduced to a one- or two-sentence visual scene description by a cheap secondary call to a smaller model (OPENAI_SCENE_MODEL, default gpt-4o-mini); the helper names a single subject, a single setting, one source of light, and one telling object, and discards mood words and metaphor. The final image prompt is then assembled from a hard full-bleed directive, the selected style sentence, the scene description, and an avoid clause that explicitly forbids colour, painterly shading, photorealism, captions, watermarks, signatures, and any kind of border, frame, or decorative edge.
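
A rough sketch of the two-stage assembly, with the scene-reduction instruction paraphrased and the helper name assumed:

    from openai import OpenAI

    client = OpenAI()

    def build_image_prompt(story_text: str, style_sentence: str, scene_model: str = "gpt-4o-mini") -> str:
        # Stage 1: collapse the three sentences into one concrete visual scene.
        scene = client.chat.completions.create(
            model=scene_model,
            messages=[
                {"role": "system", "content": (
                    "Describe one visual scene in at most two sentences: one subject, one setting, "
                    "one source of light, one telling object. No mood words, no metaphor.")},
                {"role": "user", "content": story_text},
            ],
        ).choices[0].message.content.strip()

        # Stage 2: assemble the final prompt around the scene.
        return (
            "Square 1:1 full-bleed illustration, no borders, no margins, centred subject. "
            + style_sentence + " " + scene + " "
            + "Avoid: colour, painterly shading, photorealism, captions, watermarks, signatures, "
              "any kind of border, frame, or decorative edge."
        )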

Image API and retries

The configured image model defaults to gpt-image-1; common typos such as gpt-image-1.5 are normalised back to gpt-image-1 at start-up. The generation size is configurable through OPENAI_IMAGE_GEN_SIZE (default 1024×1024) and the post-processed final size is fixed at 1024×1024. Each generation tolerates up to three attempts (configurable via IMAGE_RETRIES) with exponential backoff starting at 1.5 seconds and capped at 10 seconds. If all attempts fail — or if a returned image is blank or near-uniform — a small inline SVG placeholder is written to disk and recorded as the story's image path so the front end always has something to display. A few legacy stories carry a one-pixel GIF placeholder serving the same purpose; the application detects either kind by the presence of the substring _placeholder. in the path.
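
The retry loop, under the assumption that the images endpoint returns base64-encoded PNG bytes, might be sketched as:

    import base64
    import time
    from openai import OpenAI

    client = OpenAI()

    def generate_image(prompt: str, out_path: str, retries: int = 3, size: str = "1024x1024") -> bool:
        delay = 1.5
        for attempt in range(1, retries + 1):
            try:
                resp = client.images.generate(model="gpt-image-1", prompt=prompt, size=size)
                with open(out_path, "wb") as fh:
                    fh.write(base64.b64decode(resp.data[0].b64_json))
                return True
            except Exception:
                if attempt == retries:
                    break
                time.sleep(delay)
                delay = min(delay * 2, 10.0)  # exponential backoff, capped at 10 seconds
        return False  # caller writes the SVG placeholder and records its path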

Post-processing

Successfully returned PNG bytes are passed through a Pillow-based post-processor that:

  • Estimates the background colour from the four corner patches.
  • Computes a difference image against a uniform background of that colour, raises its contrast, and finds the bounding box of meaningful content.
  • Crops away any uniform border of more than ten pixels on any side, padded by two pixels to avoid clipping ink.
  • Resizes the result back to the final size using Lanczos resampling and re-encodes as optimised PNG.
  • Computes the standard deviation of the resulting image's luminance; if the value falls below a small threshold, the image is treated as blank and the generation attempt is failed so a retry can be issued.

If Pillow is unavailable or any step fails, the original bytes are written through unchanged.
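
A condensed Pillow sketch of the trim and blank-detection steps; the background estimate and thresholds are simplified assumptions, whereas the real routine samples all four corners and only crops borders wider than ten pixels:

    from io import BytesIO
    from PIL import Image, ImageChops, ImageStat

    def postprocess(png_bytes: bytes, final_size: int = 1024, min_stddev: float = 4.0) -> bytes:
        try:
            img = Image.open(BytesIO(png_bytes)).convert("RGB")

            # Diff against a flat field of the corner colour to find the content bounding box.
            bg = Image.new("RGB", img.size, img.getpixel((0, 0)))
            diff = ImageChops.difference(img, bg)
            diff = ImageChops.add(diff, diff, 2.0, -30)  # suppress faint noise so getbbox ignores it
            bbox = diff.getbbox()
            if bbox:
                img = img.crop(bbox)

            img = img.resize((final_size, final_size), Image.Resampling.LANCZOS)

            # Blank detection: a near-uniform result should fail the attempt so a retry is issued.
            if ImageStat.Stat(img.convert("L")).stddev[0] < min_stddev:
                raise ValueError("image is blank or near-uniform")

            out = BytesIO()
            img.save(out, format="PNG", optimize=True)
            return out.getvalue()
        except ValueError:
            raise  # propagate so the caller can fail this generation attempt
        except Exception:
            return png_bytes  # any other failure: pass the original bytes through unchanged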

Public interface

Home

The home page (/) renders the generation form together with a gallery of the five most recent stories that successfully produced an image. For unauthenticated visitors, the form is hidden and replaced by a rotating notice (one of twenty-two phrasings, selected from a SHA-256 hash of the visitor's IP and a six-hour bucket) when the visitor has already generated a story in the last 24 hours or when the site-wide guest cap has been reached. A footer-level statistics block displays the number of guest stories in the last 24 hours, the number of stories created by signed-in users in the same window, the all-time total, and the time elapsed since the most recent story.

Story detail

Each story is reachable at /story/<id> and is publicly viewable regardless of authorship. The page presents the three sentences alongside the illustration; for the administrator, controls are exposed to regenerate the image (/regenerate-image/<id>) or delete the story (/story/<id>/delete). A polling endpoint at /api/story/<id>/status returns a JSON document indicating whether a real image has yet been written, the URL of the image (or placeholder), and a flag distinguishing the two; the front-end script polls with exponential backoff capped at eight seconds, gives up after a hard timeout, and writes status updates into an aria-live region for screen-reader accessibility. The story page also overrides the base template's Open Graph block to advertise per-story metadata, including the generated illustration, when the page is shared on social media.
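
A Flask sketch of the status endpoint, assuming the Story model described earlier and illustrative JSON field names:

    from flask import jsonify, url_for

    @app.route("/api/story/<int:story_id>/status")
    def story_status(story_id: int):
        story = db.get_or_404(Story, story_id)
        has_image = bool(story.image_path)
        is_placeholder = has_image and "_placeholder." in story.image_path
        return jsonify(
            ready=has_image and not is_placeholder,
            placeholder=is_placeholder,
            image_url=url_for("static", filename=story.image_path) if has_image else None,
        )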

Archive

The archive (/archive) is a paginated chronological listing of all stories, nine per page, with previous/next navigation. There is no per-user filter — all stories are visible. The archive page sets <meta name="robots" content="noindex,follow"> to keep the long tail out of search-engine indexes while leaving individual story pages indexable.

Health endpoint

/healthz returns a small JSON document and an HTTP 200 when the application is fully configured and able to query the database, or HTTP 503 when the OpenAI API key is not set or the database round-trip fails. It is intended for use by uptime monitors and load balancers.
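
A sketch of such an endpoint in Flask, assuming the standard OPENAI_API_KEY environment variable and the db handle used elsewhere:

    import os
    from flask import jsonify
    from sqlalchemy import text

    @app.route("/healthz")
    def healthz():
        if not os.environ.get("OPENAI_API_KEY"):
            return jsonify(status="error", reason="OPENAI_API_KEY not set"), 503
        try:
            db.session.execute(text("SELECT 1"))  # cheap database round-trip
        except Exception as exc:
            return jsonify(status="error", reason=str(exc)), 503
        return jsonify(status="ok"), 200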

Authentication

/login accepts only the e-mail address configured in ADMIN_EMAIL and verifies the password against the stored Werkzeug hash. /logout ends the session. /signup is intentionally disabled. All form-bearing pages carry a CSRF token rendered from the Flask-WTF helper, and an expired token is recovered with a flash message and a redirect rather than an HTTP error.

Tagline rotation

The base template injects a "current tagline" string into every response. The selection is deterministic on the current 30-minute interval since the Unix epoch: the interval timestamp is used as a seed for the standard library random module, which then picks one of forty-five variant phrasings. All visitors served within the same half-hour see the same tagline; the tagline rotates without any database state.
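
The stateless rotation takes only a few lines (tagline list abridged):

    import random
    import time

    TAGLINES = [
        "Dredge slivers of impossible worlds from the black gulfs of imagination "
        "and press them into trembling mortal words.",
        # … forty-four further variants …
    ]

    def current_tagline() -> str:
        bucket = int(time.time() // 1800)  # 30-minute intervals since the Unix epoch
        return random.Random(bucket).choice(TAGLINES)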

Rate limiting

Rate limiting is enforced for unauthenticated visitors only:

  • Per-IP daily limit: any IP that has produced a story within the last 24 hours is blocked from generating another (one story per visitor per day).
  • Site-wide guest cap: the total number of guest-authored stories in the last 24 hours must not exceed GUEST_DAILY_CAP (default 24). When the cap is reached, the form is hidden for all guests until older stories age out.

Both limits are evaluated at form render time (to hide the form) and at form submission time (to short-circuit the POST with a flashed warning and a redirect). Authenticated users have no rate limits and no per-IP enforcement.
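
A sketch of the guest checks against the Story table, assuming the model fields described under Data model:

    from datetime import datetime, timedelta

    def guest_may_generate(ip: str, guest_daily_cap: int = 24) -> bool:
        """True when neither the per-IP daily limit nor the site-wide guest cap blocks generation."""
        cutoff = datetime.utcnow() - timedelta(hours=24)
        recent_guest = Story.query.filter(Story.user_id.is_(None), Story.created_at >= cutoff)
        if recent_guest.filter(Story.ip_address == ip).count() > 0:
            return False  # one guest story per IP per 24 hours
        return recent_guest.count() < guest_daily_cap  # GUEST_DAILY_CAP, default 24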

The originating IP for limit accounting and for the Story.ip_address column is read from the X-Forwarded-For request header only when the immediate peer is in a configurable trusted-proxy allow-list (TRUSTED_PROXIES, default loopback); otherwise the application uses request.remote_addr. This prevents trivial spoofing of the originating address when the application is exposed without a reverse proxy in front of it.
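
The trusted-proxy logic can be sketched as follows, assuming a TRUSTED_PROXIES set parsed from the environment:

    from flask import request

    TRUSTED_PROXIES = {"127.0.0.1", "::1"}  # default: loopback only

    def client_ip() -> str:
        """Honour X-Forwarded-For only when the immediate peer is a trusted proxy."""
        forwarded = request.headers.get("X-Forwarded-For", "")
        if forwarded and request.remote_addr in TRUSTED_PROXIES:
            return forwarded.split(",")[0].strip()  # first value is the original client
        return request.remote_addr or ""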

Deployment

The production deployment is an Ubuntu VPS running a systemd unit (acbc.service) that launches Gunicorn with three workers, bound to a Unix domain socket under the project directory. Nginx terminates HTTPS (provisioned by Let's Encrypt via Certbot), serves the static/ directory directly with a one-year immutable cache header, and proxies all other requests to the Gunicorn socket. The service runs as the django system user out of /home/django/acbc. The application reads its configuration from environment variables loaded via python-dotenv.

Security and authorisation

  • All POST endpoints require a CSRF token issued by Flask-WTF.
  • Sign-ups are disabled; only the bootstrapped administrator account can authenticate.
  • Login attempts for any e-mail other than ADMIN_EMAIL are rejected without a database query.
  • The image-deletion helper refuses to remove any path that does not begin with images/ or that, after canonicalisation, falls outside the configured upload folder, providing protection against path-traversal in stored data.
  • Story deletion requires either the administrator session or ownership of the row; image regeneration requires the administrator session.
  • The story-status JSON endpoint returns no user-identifying information beyond the public fields rendered on the story page.
  • The originating-IP determination only honours X-Forwarded-For from trusted proxies.

See also

References

  1. requirements.txt in the project repository pins Flask 3.0.3, Flask-Login 0.6.3, Flask-SQLAlchemy 3.1.1, Flask-WTF ≥1.2, python-dotenv 1.0.1, openai ≥1.50, httpx 0.27.2, and Pillow ≥10.