Distorted images

#3
by HarryML - opened

The model produces distorted images. Can you guide me through the exact process for a Jupyter notebook?

Photoroom org

Thanks a lot for the feedback. Could you share some examples of distorted images and the prompts and parameters you used to generate them?

I won't be able to upload the image, as I would have to run the whole setup again, but the image produced has distorted limbs, a distorted outline, and a distorted face.
The prompt was: "a woman in a long coat sitting in a chair, classy, elegant, fashion photography, full body"
The code I used was the one from the model card:

from diffusers.pipelines.prx import PRXPipeline
import torch

pipe = PRXPipeline.from_pretrained(
    "Photoroom/prx-1024-t2i-beta",
    torch_dtype=torch.bfloat16
).to("cuda")

prompt = "A front-facing portrait of a lion in the golden savanna at sunset"
image = pipe(prompt, num_inference_steps=28, guidance_scale=5.0).images[0]
image.save("lion.png")

Photoroom org

Hi! Thanks for the feedback.
This mainly comes from two things:

  1. The prompt is a bit short, and for now the model behaves better with very detailed prompts.
  2. This preview is a bit undertrained. We are currently fine-tuning it on shorter prompts, which should hopefully both improve overall quality and let it handle shorter prompts.

Could you try with a larger number of steps (50?) and an extended prompt like this one:

A sophisticated woman in her late twenties sits poised in a luxurious velvet armchair, exuding timeless elegance. She wears a floor-length cashmere coat in deep charcoal gray, tailored to perfection with clean lines, a subtle belt at the waist, and wide lapels. The coat drapes beautifully over the chair, pooling slightly at her feet. Underneath, a hint of a silk slip dress in champagne tones is visible. Her legs are crossed gracefully, one foot in sleek pointed-toe leather heels just visible beneath the coat's hem.
She sits in a three-quarter pose, her body angled toward the camera with her face turned slightly, giving a confident yet approachable gaze. One hand rests elegantly on the chair's arm, the other placed delicately in her lap. Her expression is serene and self-assured, with natural, minimal makeup highlighting her features. Her hair is styled in a polished low bun or sleek shoulder-length waves.
The setting is a minimalist, high-end studio with soft, diffused natural lighting streaming from a large window to the side, creating gentle shadows and dimension. The background is clean—either a smooth cream wall or subtle textured backdrop. The chair is positioned on a polished concrete floor or plush neutral rug.
Shot in full body frame with a professional fashion photography aesthetic: sharp focus, shallow depth of field, shot on medium format film or high-end digital camera, 85mm lens, f/2.8, editorial quality, Vogue-style composition, muted color palette with rich textures.
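
For a notebook, the suggestion above could be tried with a cell along these lines. This is only a sketch: it reuses the call from the model card snippet with 50 steps instead of 28, and the guidance scale of 5.0 is simply carried over from that snippet, not a tuned value. `build_sampling_kwargs` and `generate_image` are hypothetical helper names introduced here for illustration, and the code assumes a CUDA GPU and a diffusers version that ships `PRXPipeline`.

```python
from typing import Any, Dict

# Model ID as given in the model card snippet quoted earlier in this thread.
MODEL_ID = "Photoroom/prx-1024-t2i-beta"


def build_sampling_kwargs(steps: int = 50, guidance_scale: float = 5.0) -> Dict[str, Any]:
    """Hypothetical helper: collect the sampling arguments in one place."""
    if steps < 1:
        raise ValueError("steps must be a positive integer")
    return {"num_inference_steps": steps, "guidance_scale": guidance_scale}


def generate_image(prompt: str, steps: int = 50, guidance_scale: float = 5.0):
    """Hypothetical helper: load the pipeline and generate a single image."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")

    # Imported lazily so the helper above can be checked without a GPU setup.
    import torch
    from diffusers.pipelines.prx import PRXPipeline

    pipe = PRXPipeline.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,
    ).to("cuda")

    # Same call shape as the model card example, with more inference steps.
    return pipe(prompt, **build_sampling_kwargs(steps, guidance_scale)).images[0]
```

In a notebook cell, paste the extended prompt above into a string and call `generate_image(extended_prompt).save("woman_coat.png")`; the output filename is arbitrary. Running the same prompt once with `steps=28` and once with `steps=50` makes it easier to see whether the extra steps help.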

50
Got this result. It's better than the shorter-prompt version, but I believe it still needs some improvement.

Anyway, I do appreciate your help.
