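AI Image Generation Integration: Connecting Stable Diffusion and DALL-E to Java Applications
2026/6/29 · about 23 min read · Tags: image generation, DALL-E, Stable Diffusion, Spring AI, Java, text-to-image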
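Designer outsourcing costs drop from 80,000 to 3,000 yuan
In September 2025, Lao Zhao, the technical director of a mid-sized e-commerce company in Hangzhou, came to me with a story he himself found hard to believe.
His company was spending 80,000 yuan a month on outsourced design work, mostly product scene images. Every new product launch needed 3-5 images in different settings (white background, lifestyle scene, holiday atmosphere), and each outsourced image cost between 300 and 800 yuan.
That adds up to nearly 1 million yuan a year (80,000 × 12 = 960,000) on product scene images alone.
One engineer spent two weeks integrating DALL-E 3 and shipped an AI product image generation system.
Three months later, the numbers surprised everyone:
- Designer outsourcing cost: from 80,000 yuan/month → 3,000 yuan/month (a 96% reduction)
- Image turnaround: from 3-5 days → 30 seconds
- Image count: from 2-3 images per person → 10-20 candidates to choose from
- Were the designers laid off? No. They moved from drawing images to quality review and retouching of AI output.
Today I'll reconstruct the technical implementation of that system and walk you through building a production-grade AI image generation service from scratch.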
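Chapter 1: Image Generation API Landscape
1.1 Comparison of mainstream APIs
Image generation API capability comparison (radar chart data):
| Service | Image quality | Prompt understanding | Style diversity | Generation speed | Price advantage | API stability |
|---|---|---|---|---|---|---|
| DALL-E 3 | 88 | 92 | 80 | 75 | 70 | 95 |
| Midjourney API | 95 | 88 | 95 | 60 | 60 | 70 |
| Stable Diffusion XL | 85 | 75 | 92 | 85 | 95 | 80 |
| Ideogram 2.0 | 82 | 85 | 78 | 80 | 75 | 85 |
| Flux 1.1 Pro | 90 | 88 | 88 | 78 | 78 | 82 |

| Service | Quality | Chinese prompts | Price per image | API available | Best for |
|---|---|---|---|---|---|
| DALL-E 3 (1024x1024) | ★★★★☆ | ✅ Supported | $0.040 | ✅ | General commercial images |
| DALL-E 3 HD | ★★★★★ | ✅ Supported | $0.080 | ✅ | High-quality promotional images |
| Midjourney API | ★★★★★ | ✅ Limited | $0.10+ | Third-party | Artistic work |
| Stable Diffusion API (Stability AI) | ★★★★☆ | ✅ Limited | $0.003-0.02 | ✅ | Batch generation |
| Flux 1.1 Pro (Black Forest Labs) | ★★★★★ | ✅ | $0.04 | ✅ | Photorealism |
| Alibaba Cloud Tongyi Wanxiang | ★★★★☆ | ✅ Best | ¥0.14 | ✅ | Compliance in mainland China |
| Baidu Wenxin Yige | ★★★☆☆ | ✅ Best | ¥0.08 | ✅ | Compliance in mainland China |
| Self-hosted SD (ComfyUI) | ★★★★☆ | Needs fine-tuning | $0 (electricity only) | REST API | Private deployment |
1.2 Selection decision flow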
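One way to read the "Best for" column of the table above as a decision procedure is sketched below. This is only an illustration under stated assumptions: the `Provider`, `Requirements` and `ProviderSelector` names are hypothetical, the order of the checks is a judgment call, and the 20,000 images/month bulk threshold is an assumed cut-off rather than a measured one.

```java
// Hypothetical names throughout: none of these types come from an SDK.
enum Provider { DALLE3, DALLE3_HD, STABILITY_API, FLUX_PRO, TONGYI_WANXIANG, SELF_HOSTED_SD }

// The requirement flags are assumptions about what a caller would know up front.
record Requirements(boolean mustStayInChina, boolean privateDeployment,
                    boolean photorealistic, boolean premiumQuality, int imagesPerMonth) {}

class ProviderSelector {
    // Assumed threshold above which per-image price dominates the decision.
    static final int BULK_THRESHOLD = 20_000;

    static Provider choose(Requirements r) {
        if (r.privateDeployment()) return Provider.SELF_HOSTED_SD;               // "Private deployment"
        if (r.mustStayInChina()) return Provider.TONGYI_WANXIANG;                // "Compliance in mainland China"
        if (r.imagesPerMonth() > BULK_THRESHOLD) return Provider.STABILITY_API;  // "Batch generation"
        if (r.photorealistic()) return Provider.FLUX_PRO;                        // "Photorealism"
        if (r.premiumQuality()) return Provider.DALLE3_HD;                       // "High-quality promotional images"
        return Provider.DALLE3;                                                  // "General commercial images"
    }

    public static void main(String[] args) {
        Requirements smallShop = new Requirements(false, false, false, false, 500);
        System.out.println(choose(smallShop));
    }
}
```

Hard constraints (deployment model, regulatory region) are checked before cost and quality preferences, so a preference can never override a constraint.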
Chapter 2: Integrating DALL-E 3 with Spring AI
2.1 Maven dependencies and configuration
<!-- pom.xml -->
<dependencies>
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <!-- Spring AI 1.0.0 GA uses this starter name; milestone builds used spring-ai-openai-spring-boot-starter -->
        <artifactId>spring-ai-starter-model-openai</artifactId>
        <version>1.0.0</version>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
    <!-- OSS storage -->
    <dependency>
        <groupId>com.aliyun.oss</groupId>
        <artifactId>aliyun-sdk-oss</artifactId>
        <version>3.17.2</version>
    </dependency>
    <!-- HTTP client (for downloading generated images) -->
    <dependency>
        <groupId>org.apache.httpcomponents.client5</groupId>
        <artifactId>httpclient5</artifactId>
        <version>5.3.1</version>
    </dependency>
</dependencies>
# application.yml
spring:
  ai:
    openai:
      api-key: ${OPENAI_API_KEY}
      image:
        options:
          model: dall-e-3
          quality: standard       # standard / hd
          style: vivid            # vivid (saturated) / natural (realistic)
          size: 1024x1024         # 1024x1024 / 1792x1024 / 1024x1792
          response-format: url    # url (link that expires) / b64_json (Base64)
          n: 1                    # DALL-E 3 generates only one image per request
# Image generation business settings
image:
  generation:
    max-concurrent: 5         # maximum concurrent generations
    timeout-seconds: 60       # timeout for a single generation
    retry-max-attempts: 2     # retries on failure
    cache-enabled: true
    cache-ttl-hours: 24
  storage:
    oss-bucket: your-bucket
    cdn-domain: https://cdn.your-domain.com
    path-prefix: ai-images
2.2 DALL-E 3 core service implementation
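package com.example.image.service;
import org.springframework.ai.image.*;
import org.springframework.ai.openai.OpenAiImageModel;
import org.springframework.ai.openai.OpenAiImageOptions;
import org.springframework.stereotype.Service;
import org.springframework.util.DigestUtils;
import lombok.extern.slf4j.Slf4j;
import java.nio.charset.StandardCharsets;
import java.time.Instant;
import java.util.Base64;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.stream.Collectors;
/**
 * DALL-E 3 image generation service.
 * ImageStorageService, ImageCacheService, ImageGenerationResult, ImageGenerationOptions,
 * BusinessException and ErrorCode are application classes.
 */
@Slf4j
@Service
public class DalleImageGenerationService {
    private final OpenAiImageModel imageModel;
    private final ImageStorageService storageService;
    private final ImageCacheService cacheService;
    public DalleImageGenerationService(OpenAiImageModel imageModel,
                                       ImageStorageService storageService,
                                       ImageCacheService cacheService) {
        this.imageModel = imageModel;
        this.storageService = storageService;
        this.cacheService = cacheService;
    }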
    /**
     * Generate one image (standard call).
     *
     * @param prompt  text description of the image
     * @param options generation options (size, quality, style)
     * @return the result, including the URL of the stored image
     */
    public ImageGenerationResult generate(String prompt, ImageGenerationOptions options) {
        // Check the cache first so an identical prompt is not generated twice
        String cacheKey = buildCacheKey(prompt, options);
        Optional<ImageGenerationResult> cached = cacheService.get(cacheKey);
        if (cached.isPresent()) {
            log.info("Image cache hit, prompt: {}", truncate(prompt, 50));
            return cached.get();
        }
        Instant start = Instant.now();
        // Build the request options
        OpenAiImageOptions imageOptions = OpenAiImageOptions.builder()
                .model("dall-e-3")
                .quality(options.isHighQuality() ? "hd" : "standard")
                .style(options.isNaturalStyle() ? "natural" : "vivid")
                .width(options.getWidth())
                .height(options.getHeight())
                .responseFormat("b64_json") // Base64 avoids the expiring-URL problem
                .build();
        ImagePrompt imagePrompt = new ImagePrompt(prompt, imageOptions);
        try {
            ImageResponse response = imageModel.call(imagePrompt);
            ImageGeneration generation = response.getResult();
            Image image = generation.getOutput();
            long generationMs = Instant.now().toEpochMilli() - start.toEpochMilli();
            // Upload the Base64 image data to OSS
            byte[] imageData = decodeBase64Image(image.getB64Json());
            String permanentUrl = storageService.storeImage(imageData, "png", prompt);
            // DALL-E 3 may rewrite the prompt. Spring AI exposes the revised prompt on the
            // generation metadata rather than on Image, which only carries url and b64Json.
            String revisedPrompt = generation.getMetadata()
                    instanceof org.springframework.ai.openai.metadata.OpenAiImageGenerationMetadata meta
                    ? meta.getRevisedPrompt() : null;
            log.info("Image generated in {}ms, original prompt: {}, revised: {}",
                    generationMs, truncate(prompt, 50), truncate(revisedPrompt, 50));
            ImageGenerationResult result = ImageGenerationResult.builder()
                    .originalPrompt(prompt)
                    .revisedPrompt(revisedPrompt)
                    .imageUrl(permanentUrl)
                    .generationMs(generationMs)
                    .model("dall-e-3")
                    .quality(options.isHighQuality() ? "hd" : "standard")
                    .size(options.getWidth() + "x" + options.getHeight())
                    .build();
            // Cache the result
            cacheService.put(cacheKey, result);
            return result;
        } catch (Exception e) {
            log.error("Image generation failed, prompt: {}", truncate(prompt, 100), e);
            // Classify the error: policy violations get their own code so callers can skip retries
            if (e.getMessage() != null && e.getMessage().contains("content_policy_violation")) {
                throw new BusinessException(ErrorCode.IMAGE_CONTENT_VIOLATION,
                        "The image description violates the usage policy; please revise it");
            }
            throw new BusinessException(ErrorCode.IMAGE_GENERATION_FAILED,
                    "Image generation failed; please try again later");
        }
    }
    /**
     * Generate images in batch with a cap on concurrency.
     */
    public List<ImageGenerationResult> generateBatch(List<String> prompts,
                                                     ImageGenerationOptions options,
                                                     int maxConcurrent) {
        log.info("Batch image generation: {} images, max concurrency: {}", prompts.size(), maxConcurrent);
        // The semaphore caps the number of in-flight requests
        Semaphore semaphore = new Semaphore(maxConcurrent);
        List<CompletableFuture<ImageGenerationResult>> futures = prompts.stream()
                .map(prompt -> CompletableFuture.supplyAsync(() -> {
                    // Acquire outside the try/finally: if acquire() is interrupted,
                    // no permit is held, so none must be released
                    try {
                        semaphore.acquire();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        throw new RuntimeException("Batch generation was interrupted", e);
                    }
                    try {
                        return generate(prompt, options);
                    } finally {
                        semaphore.release();
                    }
                }))
                .collect(Collectors.toList());
        // Wait for all futures and collect results; a failed image becomes a failed result
        return futures.stream()
                .map(future -> {
                    try {
                        return future.get(60, TimeUnit.SECONDS);
                    } catch (Exception e) {
                        log.error("One image failed to generate", e);
                        return ImageGenerationResult.failed(e.getMessage());
                    }
                })
                .collect(Collectors.toList());
    }
    private byte[] decodeBase64Image(String base64Data) {
        return Base64.getDecoder().decode(base64Data);
    }
    private String buildCacheKey(String prompt, ImageGenerationOptions options) {
        // Separate the fields with a delimiter so different combinations cannot collide
        String keySource = String.join("|", prompt,
                String.valueOf(options.getWidth()), String.valueOf(options.getHeight()),
                String.valueOf(options.isHighQuality()), String.valueOf(options.isNaturalStyle()));
        return "img:dalle3:" + DigestUtils.md5DigestAsHex(keySource.getBytes(StandardCharsets.UTF_8));
    }
    private String truncate(String text, int maxLength) {
        if (text == null) return "";
        return text.length() > maxLength ? text.substring(0, maxLength) + "..." : text;
    }
}
Chapter 3: Prompt Engineering for Image Generation
3.1 How prompt quality affects results
For the same product, prompts of different quality produce very different results:
/**
 * Prompt engineering: building high-quality prompts for product images.
 */
@Slf4j
@Service
public class ProductImagePromptBuilder {
    /**
     * Quality levels, for comparison:
     *
     * Level 1 (poor):      a white T-shirt
     * Level 2 (average):   white cotton T-shirt, product photo, white background
     * Level 3 (good):      white pure-cotton crew-neck T-shirt, professional e-commerce product
     *                      photography, white background, soft shadows, high resolution
     * Level 4 (excellent): the output of buildProductPrompt below
     */
    /**
     * Build the prompt for an e-commerce product image.
     */
    public String buildProductPrompt(ProductImageRequest request) {
        StringBuilder prompt = new StringBuilder();
        // 1. Product description (the core)
        prompt.append(request.getProductDescription());
        // 2. Scene setting
        String scene = switch (request.getSceneType()) {
            case WHITE_BACKGROUND -> "professional product photography, pure white background, studio lighting";
            case LIFESTYLE -> "lifestyle photography, natural home environment, warm lighting, shallow depth of field";
            case HOLIDAY_CHRISTMAS -> "Christmas holiday atmosphere, festive decorations, warm bokeh lights background";
            case HOLIDAY_CNY -> "Chinese New Year atmosphere, red and gold decorations, festive background";
            case OUTDOOR -> "outdoor natural environment, bright daylight, fresh background";
            case MINIMALIST -> "minimalist style, clean background, geometric shadows";
        };
        prompt.append(", ").append(scene);
        // 3. Photography parameters
        prompt.append(", ");
        prompt.append(buildPhotographyStyle(request.getQuality()));
        // 4. Things to avoid
        // DALL-E 3 has no separate negative-prompt parameter, so the constraints
        // are written into the prompt text itself
        prompt.append(", no watermarks, no text overlay, clean composition");
        // 5. Reference color, if any
        if (request.getPrimaryColor() != null) {
            prompt.append(", main color: ").append(request.getPrimaryColor());
        }
        log.debug("Generated prompt: {}", prompt);
        return prompt.toString();
    }
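    /**
     * Build the photography style description.
     */
    private String buildPhotographyStyle(ImageQuality quality) {
        return switch (quality) {
            case STANDARD -> "high quality product photo, 4K resolution, perfect focus";
            case HIGH -> "professional commercial photography, 8K ultra-high resolution, " +
                    "perfect lighting setup, magazine quality, award-winning photography";
            case ARTISTIC -> "artistic product photography, creative composition, " +
                    "dramatic lighting, editorial style";
        };
    }
    /**
     * Build a Chinese-language scene prompt (for use with Tongyi Wanxiang).
     * The string literals stay in Chinese on purpose: they are the prompt sent to the model.
     */
    public String buildChineseProductPrompt(ProductImageRequest request) {
        StringBuilder prompt = new StringBuilder();
        prompt.append(request.getProductDescriptionChinese());
        String sceneChinese = switch (request.getSceneType()) {
            // professional product shot, pure white background, studio lighting, high definition
            case WHITE_BACKGROUND -> ",专业商品图,纯白背景,摄影棚打光,高清晰度";
            // lifestyle scene, cozy home, natural light, shallow depth of field
            case LIFESTYLE -> ",生活场景,温馨家居环境,自然光线,浅景深虚化背景";
            // New Year atmosphere, red and gold festive decorations, warm lighting
            case HOLIDAY_CNY -> ",新年氛围,红色金色喜庆装饰,节日背景,温暖灯光";
            // product showcase, professional photography
            default -> ",产品展示图,专业摄影";
        };
        prompt.append(sceneChinese);
        // no watermark, no text, high quality, e-commerce main-image style
        prompt.append(",无水印,无文字,高品质,电商主图风格");
        return prompt.toString();
    }
    /**
     * Rewrite a prompt that might trigger content moderation.
     * An LLM pre-processes the prompt to keep it compliant.
     */
    public String sanitizeAndEnhancePrompt(ChatClient chatClient, String rawPrompt) {
        String systemPrompt = """
                You are a professional prompt optimizer for AI image generation.
                Your tasks:
                1. Check whether the prompt contains prohibited content (violence, pornography, copyright infringement, etc.) and remove it
                2. Convert Chinese descriptions into more specific English photography descriptions
                3. Add professional photography parameters (lighting, composition, resolution, etc.)
                4. Make sure the final prompt produces high-quality images suitable for commercial use
                Return only the optimized prompt, with no explanation.
                """;
        return chatClient.prompt()
                .system(systemPrompt)
                .user("Original prompt: " + rawPrompt)
                .call()
                .content();
    }
}
3.2 Scenario-based prompt template library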
/**
 * Library of preset prompt templates.
 * These prompts have been tested and produce high-quality results.
 */
@Component
public class PromptTemplateLibrary {
    /**
     * E-commerce product image templates.
     */
    public static final Map<String, String> ECOMMERCE_TEMPLATES = Map.of(
            "product_white_bg",
            "{product}, isolated on pure white background, professional product photography, " +
                    "soft box lighting from left and right, sharp focus, 4K resolution, " +
                    "no shadows, commercial quality",
            "product_lifestyle",
            "{product}, lifestyle photography, modern living room setting, " +
                    "natural daylight from window, shallow depth of field, " +
                    "warm color temperature, photorealistic, magazine editorial style",
            "product_hero_shot",
            "{product}, dramatic hero product shot, dark moody background, " +
                    "single key light from top-left, long shadows, high contrast, " +
                    "luxury brand aesthetic, cinematic composition",
            "product_holiday",
            "{product}, Christmas holiday setting, pine branches with snow, " +
                    "golden fairy lights bokeh background, warm orange glow, " +
                    "gift presentation style, cozy atmosphere",
            "product_minimalist",
            "{product}, minimalist product photography, light gray background, " +
                    "geometric minimal shadows, symmetrical composition, " +
                    "Scandinavian design aesthetic, clean and modern"
    );
    /**
     * Merge a template with a product description.
     */
    public String applyTemplate(String templateKey, String productDescription) {
        String template = ECOMMERCE_TEMPLATES.get(templateKey);
        if (template == null) {
            throw new BusinessException(ErrorCode.TEMPLATE_NOT_FOUND,
                    "Template not found: " + templateKey);
        }
        return template.replace("{product}", productDescription);
    }
    /**
     * Ask an LLM to generate the best prompt for a specific product.
     */
    public String generateOptimalPrompt(ChatClient chatClient,
                                        String productCategory,
                                        String productDescription,
                                        String targetAudience) {
        String prompt = String.format("""
                As a commercial photography expert, write the best AI image generation prompt for the following product:
                Product category: %s
                Product description: %s
                Target customers: %s
                Requirements:
                1. Write the prompt in English
                2. Include: photography style, lighting setup, background, composition elements
                3. Suitable for an e-commerce main image
                4. Avoid copyright-related descriptions
                5. Length: 200-400 words
                Output only the prompt itself, with no explanation.
                """, productCategory, productDescription, targetAudience);
        return chatClient.prompt().user(prompt).call().content();
    }
}
Chapter 4: Batch Image Generation and Concurrency Control
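Concurrency control and rate limiting are related but distinct: a semaphore bounds how many requests are in flight at once, while provider quotas are usually expressed per minute. As a minimal sketch of the second idea (the `SlidingWindowRateLimiter` class and its methods are hypothetical and not part of the service code in this chapter), a sliding-window limiter could look like the following. The current time is passed in explicitly, which is an assumption made here to keep the behavior easy to test.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Sliding-window limiter: at most maxPerWindow calls within any windowMillis span. */
class SlidingWindowRateLimiter {
    private final int maxPerWindow;
    private final long windowMillis;
    private final Deque<Long> timestamps = new ArrayDeque<>(); // accepted call times, oldest first

    SlidingWindowRateLimiter(int maxPerWindow, long windowMillis) {
        if (maxPerWindow < 1 || windowMillis < 1) {
            throw new IllegalArgumentException("maxPerWindow and windowMillis must be positive");
        }
        this.maxPerWindow = maxPerWindow;
        this.windowMillis = windowMillis;
    }

    /** Drops timestamps that have left the window ending at nowMillis. */
    private void evictExpired(long nowMillis) {
        while (!timestamps.isEmpty() && nowMillis - timestamps.peekFirst() >= windowMillis) {
            timestamps.pollFirst();
        }
    }

    /** Records the call and returns true if a slot is free at nowMillis; otherwise returns false. */
    synchronized boolean tryAcquire(long nowMillis) {
        evictExpired(nowMillis);
        if (timestamps.size() < maxPerWindow) {
            timestamps.addLast(nowMillis);
            return true;
        }
        return false;
    }

    /** Milliseconds until the oldest recorded call leaves the window; 0 if a slot is free now. */
    synchronized long millisUntilAvailable(long nowMillis) {
        evictExpired(nowMillis);
        if (timestamps.size() < maxPerWindow) return 0;
        return windowMillis - (nowMillis - timestamps.peekFirst());
    }

    public static void main(String[] args) {
        SlidingWindowRateLimiter limiter = new SlidingWindowRateLimiter(5, 60_000);
        long now = System.currentTimeMillis();
        for (int i = 0; i < 7; i++) {
            System.out.println("request " + i + " allowed: " + limiter.tryAcquire(now));
        }
    }
}
```

A worker would call `tryAcquire(System.currentTimeMillis())` before each API request and sleep for `millisUntilAvailable(...)` when it returns false. The semaphore can stay in place alongside it to cap in-flight requests.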
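4.1 Production-grade batch generation system
package com.example.image.service;
import io.micrometer.core.instrument.MeterRegistry;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.stereotype.Service;
import lombok.extern.slf4j.Slf4j;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
/**
 * Batch image generation service.
 * Supports a task queue, progress tracking and retry on failure.
 */
@Slf4j
@Service
public class BatchImageGenerationService {
    private final DalleImageGenerationService dalleService;
    private final BatchJobRepository jobRepository;
    private final MeterRegistry meterRegistry;
    // Global concurrency cap. A semaphore bounds in-flight requests, not requests per minute;
    // the DALL-E 3 quota itself is per minute and depends on the account tier
    // (quoted here as 5 images/min on the free tier and 50 images/min on a paid tier).
    private final Semaphore rateLimitSemaphore;
    // Dedicated thread pool
    private final ExecutorService executorService;
    public BatchImageGenerationService(DalleImageGenerationService dalleService,
                                       BatchJobRepository jobRepository,
                                       MeterRegistry meterRegistry,
                                       @Value("${image.generation.max-concurrent:5}") int maxConcurrent) {
        this.dalleService = dalleService;
        this.jobRepository = jobRepository;
        this.meterRegistry = meterRegistry;
        this.rateLimitSemaphore = new Semaphore(maxConcurrent);
        this.executorService = Executors.newFixedThreadPool(maxConcurrent * 2);
    }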
    /**
     * Submit a batch generation job.
     * Returns the job (with its jobId); the client polls for progress.
     */
    public BatchJob submitBatchJob(List<BatchImageTask> tasks, String submittedBy) {
        BatchJob job = BatchJob.builder()
                .jobId(UUID.randomUUID().toString())
                .totalTasks(tasks.size())
                .completedTasks(0)
                .failedTasks(0)
                .status(BatchJobStatus.PENDING)
                .submittedBy(submittedBy)
                .submittedAt(Instant.now())
                .tasks(tasks)
                .build();
        jobRepository.save(job);
        // Run asynchronously
        CompletableFuture.runAsync(() -> executeBatchJob(job), executorService);
        log.info("Batch job submitted, jobId: {}, {} images, submitted by: {}",
                job.getJobId(), tasks.size(), submittedBy);
        return job;
    }
    /**
     * Execute a batch job.
     * Caveat: this coordinator blocks one pool thread in join() while its tasks run on the
     * same pool, so many simultaneous jobs can starve the pool. A separate executor for
     * coordinators avoids that.
     */
    private void executeBatchJob(BatchJob job) {
        job.setStatus(BatchJobStatus.RUNNING);
        job.setStartedAt(Instant.now());
        jobRepository.save(job);
        List<CompletableFuture<Void>> futures = new ArrayList<>();
        AtomicInteger completed = new AtomicInteger(0);
        AtomicInteger failed = new AtomicInteger(0);
        for (BatchImageTask task : job.getTasks()) {
            CompletableFuture<Void> future = CompletableFuture.runAsync(() -> {
                try {
                    // Acquire a concurrency permit
                    rateLimitSemaphore.acquire();
                    try {
                        // Generate, with retries
                        ImageGenerationResult result = generateWithRetry(task, 2);
                        // Update the task state
                        task.setStatus(TaskStatus.COMPLETED);
                        task.setImageUrl(result.getImageUrl());
                        task.setRevisedPrompt(result.getRevisedPrompt());
                        int completedCount = completed.incrementAndGet();
                        log.info("Task done {}/{}, jobId: {}, taskId: {}",
                                completedCount, job.getTotalTasks(), job.getJobId(), task.getTaskId());
                    } finally {
                        rateLimitSemaphore.release();
                    }
                } catch (Exception e) {
                    if (e instanceof InterruptedException) {
                        Thread.currentThread().interrupt(); // preserve the interrupt flag
                    }
                    task.setStatus(TaskStatus.FAILED);
                    task.setErrorMessage(e.getMessage());
                    failed.incrementAndGet();
                    log.error("Task failed, jobId: {}, taskId: {}, error: {}",
                            job.getJobId(), task.getTaskId(), e.getMessage());
                }
                // Update job progress
                job.setCompletedTasks(completed.get());
                job.setFailedTasks(failed.get());
                jobRepository.save(job);
            }, executorService);
            futures.add(future);
        }
        // Wait for all tasks to finish
        CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
        // Set the final job status
        job.setStatus(failed.get() == 0 ? BatchJobStatus.COMPLETED : BatchJobStatus.PARTIAL_FAILED);
        job.setCompletedAt(Instant.now());
        jobRepository.save(job);
        log.info("Batch job finished, jobId: {}, succeeded: {}, failed: {}",
                job.getJobId(), completed.get(), failed.get());
        // Send a completion notification (email / push message)
        notifyJobCompletion(job);
    }
    /**
     * Generate an image with retries.
     */
    private ImageGenerationResult generateWithRetry(BatchImageTask task, int maxRetries) {
        int attempt = 0;
        Exception lastException = null;
        while (attempt <= maxRetries) {
            try {
                if (attempt > 0) {
                    // Exponential backoff before each retry: 2s, 4s, 8s, ...
                    Thread.sleep(2000L << (attempt - 1));
                    log.info("Retry #{}, taskId: {}", attempt, task.getTaskId());
                }
                return dalleService.generate(
                        task.getPrompt(),
                        ImageGenerationOptions.builder()
                                .width(task.getWidth())
                                .height(task.getHeight())
                                .highQuality(task.isHighQuality())
                                .build()
                );
            } catch (InterruptedException e) {
                // Do not retry after an interrupt; restore the flag and stop
                Thread.currentThread().interrupt();
                throw new RuntimeException("Generation interrupted", e);
            } catch (BusinessException e) {
                // Content violations are not retried
                if (e.getCode() == ErrorCode.IMAGE_CONTENT_VIOLATION) {
                    throw e;
                }
                lastException = e;
            } catch (Exception e) {
                lastException = e;
            }
            attempt++;
        }
        throw new RuntimeException("Generation failed after " + maxRetries + " retries", lastException);
    }
    /**
     * Query the progress of a batch job.
     */
    public BatchJobProgress getProgress(String jobId) {
        BatchJob job = jobRepository.findById(jobId)
                .orElseThrow(() -> new BusinessException(ErrorCode.JOB_NOT_FOUND, "Job not found"));
        double progressPercent = job.getTotalTasks() > 0 ?
                (double) (job.getCompletedTasks() + job.getFailedTasks()) / job.getTotalTasks() * 100 : 0;
        // Estimate the remaining time from the throughput so far
        Long estimatedRemainingSeconds = null;
        if (job.getStartedAt() != null && job.getCompletedTasks() > 0) {
            long elapsedMs = Instant.now().toEpochMilli() - job.getStartedAt().toEpochMilli();
            double msPerTask = (double) elapsedMs / job.getCompletedTasks();
            int remainingTasks = job.getTotalTasks() - job.getCompletedTasks() - job.getFailedTasks();
            estimatedRemainingSeconds = (long) (msPerTask * remainingTasks / 1000);
        }
        return BatchJobProgress.builder()
                .jobId(jobId)
                .status(job.getStatus())
                .totalTasks(job.getTotalTasks())
                .completedTasks(job.getCompletedTasks())
                .failedTasks(job.getFailedTasks())
                .progressPercent(progressPercent)
                .estimatedRemainingSeconds(estimatedRemainingSeconds)
                .completedImages(getCompletedImageUrls(job))
                .build();
    }
    private List<String> getCompletedImageUrls(BatchJob job) {
        return job.getTasks().stream()
                .filter(t -> t.getStatus() == TaskStatus.COMPLETED && t.getImageUrl() != null)
                .map(BatchImageTask::getImageUrl)
                .collect(Collectors.toList());
    }
    private void notifyJobCompletion(BatchJob job) {
        // Send an email or message notification
        log.info("Sending job completion notification, jobId: {}", job.getJobId());
    }
}
Chapter 5: Storing Images in OSS
5.1 OSS storage service
package com.example.image.service;
import com.aliyun.oss.OSS;
import com.aliyun.oss.model.*;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.stereotype.Service;
import lombok.extern.slf4j.Slf4j;
import java.io.ByteArrayInputStream;
import java.time.Instant;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.UUID;
/**
 * OSS storage service for images.
 */
@Slf4j
@Service
public class ImageStorageService {
    private final OSS ossClient;
    // Matches image.storage.oss-bucket in application.yml
    @Value("${image.storage.oss-bucket}")
    private String bucketName;
    @Value("${image.storage.cdn-domain}")
    private String cdnDomain;
    @Value("${image.storage.path-prefix:ai-images}")
    private String pathPrefix;
    public ImageStorageService(OSS ossClient) {
        this.ossClient = ossClient;
    }
    /**
     * Store an AI-generated image in OSS.
     *
     * @param imageData    image bytes
     * @param format       image format (png/jpg/webp)
     * @param sourcePrompt original prompt (stored as metadata for later lookup)
     * @return CDN URL of the stored image
     */
    public String storeImage(byte[] imageData, String format, String sourcePrompt) {
        // Object key layout: ai-images/2026/06/29/uuid.png
        String date = LocalDate.now().format(DateTimeFormatter.ofPattern("yyyy/MM/dd"));
        String objectKey = String.format("%s/%s/%s.%s",
                pathPrefix, date, UUID.randomUUID(), format);
        // Object metadata
        ObjectMetadata metadata = new ObjectMetadata();
        metadata.setContentType(getContentType(format));
        metadata.setContentLength(imageData.length);
        // Publicly cacheable (AI-generated images usually need no access control)
        metadata.setCacheControl("public, max-age=31536000"); // 1 year
        metadata.setContentDisposition("inline");
        // Store the prompt as user metadata for later lookup and auditing
        if (sourcePrompt != null) {
            // OSS user metadata is limited to 1KB, so long prompts are truncated.
            // Note: this truncates by characters; non-ASCII prompts take more bytes per
            // character and may need encoding before being sent as metadata.
            String truncatedPrompt = sourcePrompt.length() > 900 ?
                    sourcePrompt.substring(0, 900) : sourcePrompt;
            metadata.addUserMetadata("source-prompt", truncatedPrompt);
            metadata.addUserMetadata("generated-at", Instant.now().toString());
            metadata.addUserMetadata("generated-by", "dall-e-3");
        }
        // Upload
        PutObjectRequest request = new PutObjectRequest(
                bucketName, objectKey, new ByteArrayInputStream(imageData), metadata);
        // Access control: public read
        request.setCannedAcl(CannedAccessControlList.PublicRead);
        ossClient.putObject(request);
        String url = cdnDomain + "/" + objectKey;
        log.info("Image stored in OSS, size: {}KB, URL: {}", imageData.length / 1024, url);
        // Also prepare a thumbnail (for list views, to save bandwidth)
        generateThumbnail(imageData, format, objectKey);
        return url;
    }
    /**
     * Generate and store a thumbnail (400x400).
     * Placeholder: it only computes and logs the thumbnail key.
     */
    private void generateThumbnail(byte[] originalData, String format, String originalKey) {
        try {
            // OSS image processing can produce thumbnails online (no download + local processing)
            // Naming: original key + _thumb (configured through OSS image processing rules)
            String thumbnailKey = originalKey.replace("." + format, "_thumb." + format);
            // In a real project this is configured via OSS image processing rules; simplified here
            log.debug("Thumbnail key: {}", thumbnailKey);
        } catch (Exception e) {
            log.warn("Thumbnail generation failed; the main image is unaffected", e);
        }
    }
    /**
     * Delete an image (used to clean up non-compliant content).
     */
    public void deleteImage(String imageUrl) {
        // Extract the object key from the URL
        String objectKey = imageUrl.replace(cdnDomain + "/", "");
        ossClient.deleteObject(bucketName, objectKey);
        log.info("Image deleted: {}", imageUrl);
    }
    /**
     * Convert to WebP (smaller files).
     * Done through the OSS image processing interface, no local processing needed.
     */
    public String getWebpUrl(String originalUrl) {
        return originalUrl + "?x-oss-process=image/format,webp";
    }
    /**
     * URL of the image resized to the given dimensions (OSS image processing).
     */
    public String getResizedUrl(String originalUrl, int width, int height) {
        return String.format("%s?x-oss-process=image/resize,m_fill,w_%d,h_%d",
                originalUrl, width, height);
    }
    private String getContentType(String format) {
        return switch (format.toLowerCase()) {
            case "png" -> "image/png";
            case "jpg", "jpeg" -> "image/jpeg";
            case "webp" -> "image/webp";
            case "gif" -> "image/gif";
            default -> "image/png";
        };
    }
}
Chapter 6: ComfyUI API Integration for Self-Hosted Stable Diffusion
6.1 ComfyUI architecture overview
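From a client's point of view, the ComfyUI integration in this chapter touches three endpoints: `POST /prompt` to queue a workflow, a WebSocket at `/ws?clientId=...` for progress events, and `GET /view` to download an output file. The helper below is a small sketch of how those URLs might be built with proper encoding. The `ComfyEndpoints` class is hypothetical, and the `subfolder` query parameter on `/view` is an assumption about ComfyUI's HTTP interface that should be checked against the server version in use.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper that builds the ComfyUI URLs used by a client. */
class ComfyEndpoints {
    private final String baseUrl; // e.g. http://localhost:8188, stored without a trailing slash

    ComfyEndpoints(String baseUrl) {
        if (!baseUrl.startsWith("http://") && !baseUrl.startsWith("https://")) {
            throw new IllegalArgumentException("baseUrl must start with http:// or https://");
        }
        this.baseUrl = baseUrl.endsWith("/") ? baseUrl.substring(0, baseUrl.length() - 1) : baseUrl;
    }

    /** Endpoint that accepts the workflow JSON. */
    URI submit() {
        return URI.create(baseUrl + "/prompt");
    }

    /** WebSocket endpoint; only the scheme prefix is rewritten (http to ws, https to wss). */
    URI progressSocket(String clientId) {
        String wsBase = baseUrl.startsWith("https://")
                ? "wss://" + baseUrl.substring("https://".length())
                : "ws://" + baseUrl.substring("http://".length());
        return URI.create(wsBase + "/ws?clientId=" + encode(clientId));
    }

    /** Download URL for a generated file; subfolder may be empty (assumed parameter). */
    URI view(String filename, String subfolder) {
        return URI.create(baseUrl + "/view?filename=" + encode(filename)
                + "&subfolder=" + encode(subfolder) + "&type=output");
    }

    private static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        ComfyEndpoints endpoints = new ComfyEndpoints("http://localhost:8188");
        System.out.println(endpoints.submit());
        System.out.println(endpoints.progressSocket("demo-client"));
        System.out.println(endpoints.view("output_00001_.png", ""));
    }
}
```

Rewriting only the scheme prefix avoids the pitfall of a blanket string replacement, which would also alter a host name that happens to contain "http".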
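6.2 ComfyUI Java client
package com.example.image.comfyui;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.http.ResponseEntity;
import org.springframework.stereotype.Service;
import org.springframework.web.client.RestTemplate;
import org.springframework.web.socket.TextMessage;
import org.springframework.web.socket.WebSocketHttpHeaders;
import org.springframework.web.socket.WebSocketSession;
import org.springframework.web.socket.client.WebSocketClient;
import org.springframework.web.socket.client.standard.StandardWebSocketClient;
import org.springframework.web.socket.handler.TextWebSocketHandler;
import lombok.extern.slf4j.Slf4j;
import java.net.URI;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicReference;
/**
 * ComfyUI API client.
 * ComfyUI drives image generation with a "workflow JSON".
 */
@Slf4j
@Service
public class ComfyUIClient {
    private final RestTemplate restTemplate;
    private final WebSocketClient wsClient;
    private final ObjectMapper objectMapper;
    @Value("${comfyui.base-url:http://localhost:8188}")
    private String baseUrl;
    public ComfyUIClient(RestTemplate restTemplate, ObjectMapper objectMapper) {
        this.restTemplate = restTemplate;
        this.wsClient = new StandardWebSocketClient();
        this.objectMapper = objectMapper;
    }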
    /**
     * Submit a workflow (generate an image).
     *
     * @param workflow ComfyUI workflow JSON (exported from the ComfyUI interface)
     * @param clientId client ID (used for WebSocket notifications)
     * @return promptId (used to track progress and fetch the result)
     */
    public String submitWorkflow(Map<String, Object> workflow, String clientId) {
        Map<String, Object> requestBody = Map.of(
                "prompt", workflow,
                "client_id", clientId
        );
        try {
            ResponseEntity<Map> response = restTemplate.postForEntity(
                    baseUrl + "/prompt", requestBody, Map.class);
            if (response.getStatusCode().is2xxSuccessful() && response.getBody() != null) {
                String promptId = (String) response.getBody().get("prompt_id");
                log.info("ComfyUI workflow submitted, promptId: {}", promptId);
                return promptId;
            }
            throw new BusinessException(ErrorCode.COMFYUI_SUBMIT_FAILED, "Workflow submission failed");
        } catch (BusinessException e) {
            // Keep the specific error code instead of re-wrapping it as "unavailable" below
            throw e;
        } catch (Exception e) {
            log.error("ComfyUI workflow submission error", e);
            throw new BusinessException(ErrorCode.COMFYUI_UNAVAILABLE,
                    "ComfyUI service unavailable; please check the GPU server");
        }
    }
    /**
     * Wait for generation to finish and return the image bytes.
     * Progress is pushed over WebSocket instead of being polled.
     *
     * @param clientId should be the same client_id that was passed to submitWorkflow, since
     *                 ComfyUI addresses execution events to the client that queued the prompt.
     *                 Opening the socket before submitting avoids missing early events.
     */
    public byte[] waitForResult(String promptId, String clientId, int timeoutSeconds) throws Exception {
        CountDownLatch latch = new CountDownLatch(1);
        AtomicReference<String> resultImageName = new AtomicReference<>();
        // WebSocket connection that listens for progress; only the scheme prefix is rewritten
        URI wsUri = URI.create(baseUrl.replaceFirst("^http", "ws") + "/ws?clientId=" + clientId);
        WebSocketSession wsSession = wsClient.execute(
                new TextWebSocketHandler() {
                    @Override
                    protected void handleTextMessage(WebSocketSession session, TextMessage message) {
                        try {
                            Map<String, Object> event = objectMapper.readValue(
                                    message.getPayload(), Map.class);
                            String type = (String) event.get("type");
                            if ("executing".equals(type)) {
                                Map<String, Object> data = (Map) event.get("data");
                                log.debug("ComfyUI executing, node: {}", data.get("node"));
                            } else if ("executed".equals(type)) {
                                // A node finished executing
                                Map<String, Object> data = (Map) event.get("data");
                                if (promptId.equals(data.get("prompt_id"))) {
                                    // Check whether this is the image output node
                                    Map<String, Object> output = (Map) data.get("output");
                                    if (output != null && output.containsKey("images")) {
                                        List<Map> images = (List) output.get("images");
                                        if (!images.isEmpty()) {
                                            resultImageName.set((String) images.get(0).get("filename"));
                                            latch.countDown();
                                        }
                                    }
                                }
                            }
                        } catch (Exception e) {
                            log.error("Failed to parse ComfyUI WebSocket message", e);
                        }
                    }
                },
                new WebSocketHttpHeaders(), wsUri
        ).get();
        boolean completed;
        try {
            // Wait for generation to finish
            completed = latch.await(timeoutSeconds, TimeUnit.SECONDS);
        } finally {
            wsSession.close(); // always release the socket, even on interrupt
        }
        if (!completed) {
            throw new TimeoutException("ComfyUI generation timed out (" + timeoutSeconds + "s)");
        }
        // Download the generated image
        String imageName = resultImageName.get();
        return downloadImage(imageName);
    }
    /**
     * Download a generated image from the ComfyUI server.
     */
    private byte[] downloadImage(String filename) {
        // Pass the filename as a URI variable so RestTemplate encodes it
        String url = baseUrl + "/view?filename={filename}&type=output";
        ResponseEntity<byte[]> response = restTemplate.getForEntity(url, byte[].class, filename);
        if (response.getStatusCode().is2xxSuccessful() && response.getBody() != null) {
            log.info("ComfyUI image downloaded, file: {}, size: {}KB",
                    filename, response.getBody().length / 1024);
            return response.getBody();
        }
        throw new BusinessException(ErrorCode.IMAGE_DOWNLOAD_FAILED, "Failed to download the generated image");
    }
    /**
     * Build a standard text-to-image workflow.
     * Uses the SDXL model.
     */
    public Map<String, Object> buildTxt2ImgWorkflow(String prompt, String negativePrompt,
                                                    int width, int height, int steps) {
        // ComfyUI workflow JSON format:
        // every node has a unique ID and references other nodes' outputs through its inputs
        return Map.of(
                "1", Map.of(
                        "class_type", "CheckpointLoaderSimple",
                        "inputs", Map.of("ckpt_name", "sd_xl_base_1.0.safetensors")
                ),
                "2", Map.of(
                        "class_type", "CLIPTextEncode",
                        "inputs", Map.of(
                                "text", prompt,
                                "clip", List.of("1", 1) // second output of node 1 (CLIP)
                        )
                ),
                "3", Map.of(
                        "class_type", "CLIPTextEncode",
                        "inputs", Map.of(
                                "text", negativePrompt,
                                "clip", List.of("1", 1)
                        )
                ),
                "4", Map.of(
                        "class_type", "EmptyLatentImage",
                        "inputs", Map.of("width", width, "height", height, "batch_size", 1)
                ),
                "5", Map.of(
                        "class_type", "KSampler",
                        "inputs", Map.of(
                                "model", List.of("1", 0),
                                "positive", List.of("2", 0),
                                "negative", List.of("3", 0),
                                "latent_image", List.of("4", 0),
                                "seed", System.currentTimeMillis(),
                                "steps", steps,
                                "cfg", 7.5,
                                "sampler_name", "euler_ancestral",
                                "scheduler", "normal",
                                "denoise", 1.0
                        )
                ),
                "6", Map.of(
                        "class_type", "VAEDecode",
                        "inputs", Map.of(
                                "samples", List.of("5", 0),
                                "vae", List.of("1", 2) // third output of node 1 (VAE)
                        )
                ),
                "7", Map.of(
                        "class_type", "SaveImage",
                        "inputs", Map.of(
                                "images", List.of("6", 0),
                                "filename_prefix", "output"
                        )
                )
        );
    }
}
Chapter 7: Image Post-Processing
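7.1 Background removal and format conversion
package com.example.image.service;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.core.io.ByteArrayResource;
import org.springframework.http.*;
import org.springframework.stereotype.Service;
import org.springframework.util.LinkedMultiValueMap;
import org.springframework.util.MultiValueMap;
import org.springframework.web.client.RestTemplate;
import lombok.extern.slf4j.Slf4j;
import javax.imageio.*;
import javax.imageio.stream.ImageOutputStream;
import java.awt.*;
import java.awt.image.BufferedImage;
import java.io.*;
/**
 * Image post-processing service.
 */
@Slf4j
@Service
public class ImagePostProcessingService {
    private final RestTemplate restTemplate;
    @Value("${removebg.api-key:}")
    private String removeBgApiKey;
    public ImagePostProcessingService(RestTemplate restTemplate) {
        this.restTemplate = restTemplate;
    }
    /**
     * Background removal (using the remove.bg API).
     * Use case: product main images that need a transparent background.
     *
     * Price: $0.20 per image (basic) or $0.05 per image (bulk)
     */
    public byte[] removeBackground(byte[] imageData) {
        // Call the remove.bg API
        HttpHeaders headers = new HttpHeaders();
        headers.set("X-Api-Key", removeBgApiKey);
        headers.setContentType(MediaType.MULTIPART_FORM_DATA);
        MultiValueMap<String, Object> body = new LinkedMultiValueMap<>();
        body.add("image_file", new ByteArrayResource(imageData) {
            @Override
            public String getFilename() {
                return "image.png"; // a filename is needed for the multipart file part
            }
        });
        body.add("size", "auto");
        body.add("type", "product"); // tuned for product images
        HttpEntity<MultiValueMap<String, Object>> requestEntity =
                new HttpEntity<>(body, headers);
        ResponseEntity<byte[]> response = restTemplate.postForEntity(
                "https://api.remove.bg/v1.0/removebg",
                requestEntity,
                byte[].class
        );
        if (response.getStatusCode().is2xxSuccessful() && response.getBody() != null) {
            log.info("Background removed, original: {}KB, processed: {}KB",
                    imageData.length / 1024, response.getBody().length / 1024);
            return response.getBody();
        }
        throw new BusinessException(ErrorCode.BACKGROUND_REMOVAL_FAILED, "Background removal failed");
    }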
    /**
     * Format conversion (PNG → JPEG, to reduce file size).
     */
    public byte[] convertToJpeg(byte[] pngData, float quality) {
        try {
            BufferedImage image = ImageIO.read(new ByteArrayInputStream(pngData));
            if (image == null) {
                throw new BusinessException(ErrorCode.IMAGE_CONVERT_FAILED, "Unsupported or corrupt image data");
            }
            // JPEG has no alpha channel, so composite onto a white background first
            BufferedImage jpegReady = new BufferedImage(
                    image.getWidth(), image.getHeight(), BufferedImage.TYPE_INT_RGB);
            Graphics2D g = jpegReady.createGraphics();
            try {
                g.drawImage(image, 0, 0, Color.WHITE, null);
            } finally {
                g.dispose();
            }
            ByteArrayOutputStream output = new ByteArrayOutputStream();
            ImageWriter writer = ImageIO.getImageWritersByFormatName("jpeg").next();
            try (ImageOutputStream ios = ImageIO.createImageOutputStream(output)) {
                ImageWriteParam params = writer.getDefaultWriteParam();
                params.setCompressionMode(ImageWriteParam.MODE_EXPLICIT);
                params.setCompressionQuality(quality); // 0.0-1.0
                writer.setOutput(ios);
                writer.write(null, new IIOImage(jpegReady, null, null), params);
            } finally {
                writer.dispose();
            }
            byte[] jpegData = output.toByteArray();
            // SLF4J placeholders do not support format specifiers, so format the ratio first
            String savedPercent = String.format("%.1f",
                    (1 - (double) jpegData.length / pngData.length) * 100);
            log.info("PNG→JPEG done, original: {}KB, compressed: {}KB, saved: {}%",
                    pngData.length / 1024, jpegData.length / 1024, savedPercent);
            return jpegData;
        } catch (IOException e) {
            throw new BusinessException(ErrorCode.IMAGE_CONVERT_FAILED, "Image format conversion failed");
        }
    }
    /**
     * Compress an image for web display.
     * Target: a 1024x1024 PNG reduced from 5MB to under 200KB.
     */
    public byte[] optimizeForWeb(byte[] imageData, int maxWidthPx) {
        try {
            BufferedImage image = ImageIO.read(new ByteArrayInputStream(imageData));
            if (image == null) {
                throw new BusinessException(ErrorCode.IMAGE_OPTIMIZE_FAILED, "Unsupported or corrupt image data");
            }
            // Scale proportionally
            int newWidth = Math.min(image.getWidth(), maxWidthPx);
            int newHeight = (int) ((double) image.getHeight() / image.getWidth() * newWidth);
            BufferedImage resized = new BufferedImage(newWidth, newHeight, BufferedImage.TYPE_INT_RGB);
            Graphics2D g = resized.createGraphics();
            try {
                g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                        RenderingHints.VALUE_INTERPOLATION_BICUBIC);
                g.setColor(Color.WHITE);
                g.fillRect(0, 0, newWidth, newHeight); // white backdrop for transparent sources
                g.drawImage(image, 0, 0, newWidth, newHeight, null);
            } finally {
                g.dispose();
            }
            // Convert to JPEG and compress
            return convertToJpeg(toByteArray(resized, "PNG"), 0.85f);
        } catch (IOException e) {
            throw new BusinessException(ErrorCode.IMAGE_OPTIMIZE_FAILED, "Image optimization failed");
        }
    }
    private byte[] toByteArray(BufferedImage image, String format) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        ImageIO.write(image, format, baos);
        return baos.toByteArray();
    }
}
Chapter 8: Copyright and Compliance
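8.1 Compliance checks for commercial use of AI-generated images
8.2 Compliance management service
/**
 * Compliance management for AI images.
 */
@Slf4j
@Service
public class ImageComplianceService {
    private final ChatClient chatClient;
    public ImageComplianceService(ChatClient chatClient) {
        this.chatClient = chatClient;
    }
    /**
     * Pre-check a prompt for compliance before calling the API.
     * This avoids generating prohibited content that could get the account banned.
     */
    public PromptComplianceResult checkPromptCompliance(String prompt) {
        // Fast local rule check (keyword filter)
        List<String> violations = checkLocalRules(prompt);
        if (!violations.isEmpty()) {
            return PromptComplianceResult.rejected(
                    "The prompt contains disallowed content: " + String.join(", ", violations));
        }
        // Semantic check by an LLM (more accurate, but costs money)
        String checkPrompt = String.format("""
                Check whether the following image generation prompt violates the usage policy of AI image generation platforms:
                Prompt: %s
                Checks:
                1. Does it infringe portrait rights of a real person (does it describe a specific real person)?
                2. Does it involve copyrighted brands or trademarks (such as the Nike or Apple logos)?
                3. Does it contain adult content?
                4. Does it involve violent or gory content?
                5. Does it contain politically sensitive content?
                Return JSON: {"compliant": true/false, "issues": ["issue description"], "suggestion": "suggested revision"}
                """, prompt);
        String result = chatClient.prompt().user(checkPrompt).call().content();
        return parseComplianceResult(result);
    }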
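    /**
     * Fast local keyword filter.
     */
    private List<String> checkLocalRules(String prompt) {
        List<String> violations = new ArrayList<>();
        String lowerPrompt = prompt.toLowerCase();
        // Brand and trademark keywords
        List<String> brandKeywords = List.of("nike", "apple logo", "coca-cola", "louis vuitton");
        for (String brand : brandKeywords) {
            if (lowerPrompt.contains(brand)) {
                violations.add("Possible trademark: " + brand);
            }
        }
        // Adult-content keywords (sensitive word check) would go here.
        // A real project should use a professional content safety API (e.g. Alibaba Cloud Content Moderation)
        return violations;
    }
    /**
     * Content safety review of a generated image.
     * Intended to use the Alibaba Cloud content safety API.
     */
    public ImageSafetyResult auditGeneratedImage(byte[] imageData) {
        // A real project calls Alibaba Cloud Content Moderation, Tencent Cloud Tianyu or similar.
        // Only the interface design is shown here:
        // 1. Upload the image to the content safety API
        // 2. Get the verdict (compliant / suspected violation / violation)
        // 3. Delete violating images automatically and log them
        log.info("Image safety review, size: {}KB", imageData.length / 1024);
        // Mock result
        return ImageSafetyResult.builder()
                .safe(true)
                .confidence(0.98)
                .labels(List.of("product", "commercial"))
                .build();
    }
    private PromptComplianceResult parseComplianceResult(String json) {
        // Simplified parsing of the JSON returned by the LLM.
        // The pattern tolerates any whitespace around the colon; a JSON parser is more robust.
        if (json != null && json.matches("(?s).*\"compliant\"\\s*:\\s*true.*")) {
            return PromptComplianceResult.approved();
        }
        return PromptComplianceResult.rejected("The AI check detected a potential problem");
    }
}
Chapter 9: Cost Comparison and Optimization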
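9.1 Detailed cost comparison
Real production data (an e-commerce platform, May 2026, 100,000 images generated per month):
| Option | Cost per image | Monthly cost | Quality | Speed | Recommended for |
|---|---|---|---|---|---|
| DALL-E 3 standard | $0.04 | $4,000 | ★★★★☆ | 15s | Ordinary product images |
| DALL-E 3 HD | $0.08 | $8,000 | ★★★★★ | 20s | Premium promotional images |
| Stability AI SDXL | $0.008 | $800 | ★★★★☆ | 5s | Large-volume generation |
| Self-hosted ComfyUI (1×A100) | $0.002 | $200 | ★★★★★ | 3-8s | High-frequency workloads |
| Tongyi Wanxiang (mainland China compliance) | ¥0.14 | ¥14,000 | ★★★★☆ | 10s | Compliance in mainland China |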
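The monthly figures in the table are simply the per-image price multiplied by the monthly volume. A small sketch of that arithmetic follows, using `BigDecimal` to avoid floating-point rounding on currency values. The `MonthlyCostTable` class name is hypothetical; the prices are the ones quoted in the table.

```java
import java.math.BigDecimal;
import java.util.LinkedHashMap;
import java.util.Map;

class MonthlyCostTable {
    /** Monthly cost = price per image × images per month. */
    static BigDecimal monthlyCost(String unitPrice, int imagesPerMonth) {
        return new BigDecimal(unitPrice).multiply(BigDecimal.valueOf(imagesPerMonth));
    }

    public static void main(String[] args) {
        // Per-image prices in USD, as quoted in the table above
        Map<String, String> unitPrices = new LinkedHashMap<>();
        unitPrices.put("DALL-E 3 standard", "0.04");
        unitPrices.put("DALL-E 3 HD", "0.08");
        unitPrices.put("Stability AI SDXL", "0.008");
        unitPrices.put("Self-hosted ComfyUI", "0.002");

        int imagesPerMonth = 100_000;
        unitPrices.forEach((name, price) ->
                System.out.println(name + ": $" + monthlyCost(price, imagesPerMonth).toPlainString()));
    }
}
```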
Practical cost optimization strategy:
/**
 * Cost-aware image generation strategy.
 * StabilityAiService, ComfyUIImageService and WanxImageService are assumed application-level
 * wrappers (not shown in this article) that each expose generate(String prompt).
 */
@Slf4j
@Service
@RequiredArgsConstructor
public class CostOptimizedImageService {
    private final DalleImageGenerationService dalleService;
    private final StabilityAiService stabilityService;
    private final ComfyUIImageService comfyuiService; // assumed wrapper around ComfyUIClient
    private final WanxImageService wanxService;       // assumed Tongyi Wanxiang client
    /**
     * Smart routing: pick the most economical option for the request.
     */
    public ImageGenerationResult generateWithOptimalCost(ImageGenerationRequest request) {
        ImageGenerationStrategy strategy = selectStrategy(request);
        log.info("Image generation strategy: {}, estimated cost: ${}",
                strategy.getName(), strategy.getEstimatedCost());
        return switch (strategy.getType()) {
            case DALLE3_HD -> dalleService.generate(
                    request.getPrompt(),
                    ImageGenerationOptions.builder().width(1024).height(1024).highQuality(true).build()
            );
            case DALLE3_STANDARD -> dalleService.generate(
                    request.getPrompt(),
                    ImageGenerationOptions.builder().width(1024).height(1024).highQuality(false).build()
            );
            case STABILITY_AI -> stabilityService.generate(request.getPrompt());
            case SELF_HOSTED -> comfyuiService.generate(request.getPrompt());
            case WANX -> wanxService.generate(request.getPrompt());
        };
    }
    private ImageGenerationStrategy selectStrategy(ImageGenerationRequest request) {
        // Decision factors: quality requirement, use case, batch size
        if (request.isHeroImage() || request.isPremiumProduct()) {
            // Flagship products → DALL-E 3 HD, regardless of cost
            return ImageGenerationStrategy.DALLE3_HD;
        }
        if (request.getBatchSize() > 100) {
            // Large batches → self-hosted, lowest cost
            if (comfyuiService.isAvailable()) {
                return ImageGenerationStrategy.SELF_HOSTED;
            }
            return ImageGenerationStrategy.STABILITY_AI;
        }
        if (request.isChineseContent()) {
            // Chinese content → Tongyi Wanxiang (best understanding of Chinese prompts)
            return ImageGenerationStrategy.WANX;
        }
        // Default: DALL-E 3 Standard (balanced price and quality)
        return ImageGenerationStrategy.DALLE3_STANDARD;
    }
}
Chapter 10: FAQ
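Q1: DALL-E image quality is inconsistent, sometimes great and sometimes poor. How do I fix that?
A: The key factors that affect quality:
- Prompt quality matters most (the author puts it at roughly 70% of the effect): describe the image in specific, professional photography terms
- The style parameter: vivid (saturated, exaggerated) vs natural (realistic); natural suits product images better
- Generate several and pick the best: every DALL-E 3 result is random to some degree, so if the budget allows, generate 3 images from the same prompt and keep the best one
- Use HD mode: visibly better quality at twice the cost ($0.08 vs $0.04)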
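The "generate several and pick the best" advice can be sketched as a generic best-of-N helper. The `BestOfN` class is hypothetical, and so is the scorer: in practice it would be an image quality scoring service or a human rating, neither of which is specified here.

```java
import java.util.function.Supplier;
import java.util.function.ToDoubleFunction;

class BestOfN {
    /**
     * Calls the generator n times and returns the candidate with the highest score.
     * On ties, the earliest candidate wins.
     */
    static <T> T pickBest(int n, Supplier<T> generator, ToDoubleFunction<T> scorer) {
        if (n < 1) throw new IllegalArgumentException("n must be at least 1");
        T best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < n; i++) {
            T candidate = generator.get();
            double score = scorer.applyAsDouble(candidate);
            if (best == null || score > bestScore) {
                best = candidate;
                bestScore = score;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Stand-in generator and scorer: random numbers scored by their own value
        Double best = pickBest(3, Math::random, Double::doubleValue);
        System.out.println("best of 3: " + best);
    }
}
```

With DALL-E 3 each candidate is a separate API call, so the cost of best-of-3 is three times the per-image price.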
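Q2: How do I handle DALL-E refusing to generate (content policy violation)?
A:
- Pre-process prompts with an LLM to steer clear of prohibited descriptions
- Log rejected prompts and look for patterns
- Prepare a fallback: when a prompt is rejected, switch automatically to Stable Diffusion (its moderation is more lenient)
- Free-text prompts entered by users must always go through content moderation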
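The fallback and logging points above can be sketched as a small wrapper around two generators. Everything here is hypothetical (`ContentPolicyException`, `FallbackGenerator`, and the use of plain `Function<String, String>` in place of real provider clients); it only illustrates the control flow, and it assumes prompts have already passed content moderation before reaching it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Hypothetical exception a primary provider client would throw on a policy rejection. */
class ContentPolicyException extends RuntimeException {
    ContentPolicyException(String message) { super(message); }
}

class FallbackGenerator {
    private final Function<String, String> primary;   // e.g. a DALL-E 3 call returning an image URL
    private final Function<String, String> fallback;  // e.g. a Stable Diffusion call
    private final List<String> rejectedPrompts = new ArrayList<>(); // kept for later pattern analysis

    FallbackGenerator(Function<String, String> primary, Function<String, String> fallback) {
        this.primary = primary;
        this.fallback = fallback;
    }

    /** Tries the primary provider; on a policy rejection, records the prompt and uses the fallback. */
    String generate(String prompt) {
        try {
            return primary.apply(prompt);
        } catch (ContentPolicyException e) {
            rejectedPrompts.add(prompt);
            return fallback.apply(prompt);
        }
    }

    List<String> rejectedPrompts() {
        return List.copyOf(rejectedPrompts);
    }

    public static void main(String[] args) {
        FallbackGenerator generator = new FallbackGenerator(
                p -> { throw new ContentPolicyException("rejected"); },
                p -> "fallback-image-for:" + p);
        System.out.println(generator.generate("white cotton T-shirt on a wooden table"));
        System.out.println("rejected so far: " + generator.rejectedPrompts());
    }
}
```

Only policy rejections trigger the fallback; other failures (timeouts, quota errors) propagate so that the normal retry logic can handle them.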
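Q3: Is self-hosting ComfyUI worth it?
A: Use this formula to evaluate:
Break-even volume = monthly GPU server cost / (API price per image - self-hosted price per image)
Example: an A100 server at $800/month, DALL-E 3 at $0.04 per image, self-hosted at $0.002 per image
Break-even volume = $800 / ($0.04 - $0.002) ≈ 21,053 images/month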
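The break-even formula above can be sketched as a small helper, with the result rounded up to a whole image. The `BreakEven` class name is hypothetical; the inputs in `main` are the numbers from the example.

```java
class BreakEven {
    /**
     * Monthly volume at which self-hosting costs the same as the API:
     * serverMonthlyCost / (apiUnitPrice - selfHostedUnitPrice), rounded up.
     */
    static long breakEvenImages(double serverMonthlyCost, double apiUnitPrice, double selfHostedUnitPrice) {
        double savingPerImage = apiUnitPrice - selfHostedUnitPrice;
        if (savingPerImage <= 0) {
            throw new IllegalArgumentException("Self-hosting never breaks even if it is not cheaper per image");
        }
        return (long) Math.ceil(serverMonthlyCost / savingPerImage);
    }

    public static void main(String[] args) {
        // A100 at $800/month, DALL-E 3 at $0.04, self-hosted at $0.002
        System.out.println(breakEvenImages(800, 0.04, 0.002) + " images/month");
    }
}
```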
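Above roughly 20,000 images per month, self-hosting pays off.
Q4: Can AI-generated images be used directly for commercial sale?
A: By API:
- DALL-E 3: OpenAI assigns ownership of the generated content to the user; commercial use is allowed
- Stable Diffusion XL: CreativeML Open RAIL++-M license; commercial use is allowed
- Tongyi Wanxiang: Alibaba Cloud user agreement; the enterprise edition allows commercial use
- Midjourney: commercial use is allowed on the Standard subscription and above
They share one restriction: you may not claim the image was created by a human, and some countries require an "AI-generated" label.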
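Q5: When generating 10,000 images in batch, how do I keep quality consistent?
A:
- Build a standard prompt template library (instead of writing a new prompt for every image)
- Fill template variables automatically with an LLM (change only the product description and keep the other parameters fixed)
- Set up manual spot checks (review 5 out of every 100 images)
- Regenerate low-quality images automatically (using an image quality scoring API)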
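The spot-check rule above (5 out of every 100 images) can be sketched as a sampler that picks random indices within each consecutive batch. The `SpotCheckSampler` class is hypothetical, and passing in the `Random` instance is an assumption made so that a run can be reproduced with a fixed seed.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

class SpotCheckSampler {
    /**
     * Splits indices 0..total-1 into consecutive batches of batchSize and picks up to
     * perBatch distinct random indices from each batch (the last batch may be smaller).
     */
    static List<Integer> sampleIndices(int total, int batchSize, int perBatch, Random rng) {
        if (batchSize <= 0 || perBatch < 0) {
            throw new IllegalArgumentException("batchSize must be positive and perBatch non-negative");
        }
        List<Integer> picked = new ArrayList<>();
        for (int start = 0; start < total; start += batchSize) {
            int end = Math.min(start + batchSize, total);
            List<Integer> batch = new ArrayList<>();
            for (int i = start; i < end; i++) {
                batch.add(i);
            }
            Collections.shuffle(batch, rng); // random order, then take the first few
            picked.addAll(batch.subList(0, Math.min(perBatch, batch.size())));
        }
        return picked;
    }

    public static void main(String[] args) {
        // 10,000 generated images, review 5 from every 100
        List<Integer> toReview = sampleIndices(10_000, 100, 5, new Random());
        System.out.println("images to review: " + toReview.size());
    }
}
```

Sampling per batch rather than across the whole run spreads reviews evenly, so a quality drift that starts midway through a job is still likely to be caught.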
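Summary
AI image generation has gone from a "toy" to a "productivity tool". The e-commerce company that cut its monthly spend from 80,000 to 3,000 yuan is not a special case.
Core benefits:
- Cost reduced by more than 90% (traditional outsourcing vs API calls)
- Turnaround faster by several orders of magnitude (days → seconds)
- Scale (traditional designers cannot quickly produce hundreds of images in a consistent style)
Key implementation points:
- DALL-E 3 + Spring AI is the fastest integration path (it can go live in a day)
- Prompt engineering is the core competitive advantage; good templates are valuable
- Batch generation with concurrency control is the key to production rollout
- Self-hosted ComfyUI can cut costs substantially in high-frequency scenarios
- Compliance management cannot be ignored; always confirm copyright terms before commercial use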
