Lycolog

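Launch AUTOMATIC1111 and open Tagger → Batch from directory.
In Additional tags, set the trigger word used to bring out the character.
- A trigger word that collides with an existing tag should make the results drift, so pick a unique name. For Lisa Imai, for example, something like imls: any arbitrary string you can recognize yourself will do.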

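3. Clean up the tag files

The previous step produced one tag file per image file, so the next job is to clean up the contents of those files.

First, find and remove junk words. They usually appear across multiple files, so a bulk find-and-replace works best.

Next, remove the fixed elements. Delete the tags for anything that never changes, such as hairstyle, hair color, eye color, or distinctive accessories. For a character with purple eyes, delete purple eyes; for a character with a distinctive hair accessory, delete hair ornament. Removing a trait you want to keep from the tags makes that trait more likely to be baked in.

Conversely, keep the tags for anything that can vary, such as clothing, pose, body orientation, or whether glasses are worn. That makes it easier to get a character that captures the right traits, and because the fixed traits appear without being written in the prompt, it also saves tokens.

However, my impression is that this trick has little effect when there are only a few training images.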
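The tag cleanup described above can be sketched in Python. This is only a rough illustration, assuming the tag files are UTF-8 `.txt` files holding comma-separated tags; `FIXED_TAGS`, `strip_fixed_tags`, and `clean_tag_files` are names made up for this example, not part of any tool mentioned here.

```python
from pathlib import Path

# Hypothetical example: traits that never change for the character.
FIXED_TAGS = {"purple eyes", "hair ornament"}


def strip_fixed_tags(caption: str, fixed_tags: set[str]) -> str:
    """Return the comma-separated caption without any tag listed in fixed_tags."""
    tags = [tag.strip() for tag in caption.split(",")]
    return ", ".join(tag for tag in tags if tag and tag not in fixed_tags)


def clean_tag_files(directory: Path, fixed_tags: set[str]) -> int:
    """Rewrite every .txt tag file in directory and return how many were rewritten."""
    rewritten = 0
    for path in sorted(directory.glob("*.txt")):
        original = path.read_text(encoding="utf-8").strip()
        cleaned = strip_fixed_tags(original, fixed_tags)
        if cleaned != original:
            path.write_text(cleaned, encoding="utf-8")
            rewritten += 1
    return rewritten


if __name__ == "__main__":
    # The trigger word and the variable tags stay; the fixed traits are dropped.
    sample = "imls, 1girl, purple eyes, glasses, hair ornament, smile"
    print(strip_fixed_tags(sample, FIXED_TAGS))
```

Tags that should stay variable (clothing, pose, glasses) are simply left out of `FIXED_TAGS`. Running `clean_tag_files` on a copy of the directory first is probably the safer choice, since it overwrites files in place.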

Reference: LoRA／学習方法 - としあきdiffusion Wiki

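4. Training and creating the model

Save the following code under any name, such as New Preset.xmlora, and load it in Kohya_LoRA_param_GUI. Then point it at the checkpoint model, the training images, and an output location for the model, and run it to generate the model.

In my experience, ponyDiffusionV6XL_v6StartWithThisOne.safetensors seems to be a safe choice.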

<?xml version="1.0" encoding="utf-8"?>
<TrainParams xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <ModelPath></ModelPath>
  <TrainImagePath></TrainImagePath>
  <OutputPath></OutputPath>
  <TensorBoardLogPath />
  <LoraModelPath />
  <LearningRate>0.0001</LearningRate>
  <Resolution>1024</Resolution>
  <BatchSize>2</BatchSize>
  <Epochs>8</Epochs>
  <NetworkDim>8</NetworkDim>
  <NetworkAlpha>3</NetworkAlpha>
  <RegImagePath />
  <ShuffleCaptions>true</ShuffleCaptions>
  <KeepTokenCount>0</KeepTokenCount>
  <SaveEveryNEpochs>0</SaveEveryNEpochs>
  <OptimizerType>AdamW8bit</OptimizerType>
  <WarmupSteps>250</WarmupSteps>
  <OutputName></OutputName>
  <Comment />
  <CpuThreads>1</CpuThreads>
  <NoBucketUpscaling>false</NoBucketUpscaling>
  <UseWarmupInit>false</UseWarmupInit>
  <ClipSkip>2</ClipSkip>
  <Seed>42</Seed>
  <SavePrecision>fp16</SavePrecision>
  <SchedulerType>cosine_with_restarts</SchedulerType>
  <MinBucketResolution>320</MinBucketResolution>
  <MaxBucketResolution>1536</MaxBucketResolution>
  <CaptionFileExtension>.txt</CaptionFileExtension>
  <VAEPath />
  <UnetLR>-1</UnetLR>
  <TextEncoderLR>-1</TextEncoderLR>
  <NoiseOffset>0</NoiseOffset>
  <Momentum>0.9</Momentum>
  <advancedTrainType>UNetOnly</advancedTrainType>
  <CrossAttenType>xformers</CrossAttenType>
  <UseGradient>true</UseGradient>
  <UseWeightedCaptions>false</UseWeightedCaptions>
  <AdaptiveNoiseScale>0</AdaptiveNoiseScale>
  <MinSNRGamma>0</MinSNRGamma>
  <MultiresNoiseIterations>0</MultiresNoiseIterations>
  <MultiresNoiseDiscount>0</MultiresNoiseDiscount>
  <NetworkDropout>0</NetworkDropout>
  <RankDropout>0</RankDropout>
  <ModuleDropout>0</ModuleDropout>
  <MaxNormReg>0</MaxNormReg>
  <CaptionDropout>0</CaptionDropout>
  <IpNoiseGamma>0</IpNoiseGamma>
  <ModuleType>LoRA</ModuleType>
  <AlgoType>lora</AlgoType>
  <ConvDim>4</ConvDim>
  <ConvAlpha>1</ConvAlpha>
  <UseConv2dExtend>true</UseConv2dExtend>
  <DyLoRAUnit>4</DyLoRAUnit>
  <DatasetConfigPath />
  <TrainNorm>false</TrainNorm>
  <RescaledOFT>false</RescaledOFT>
  <ConstrainedOFT>false</ConstrainedOFT>
  <UseScalar>false</UseScalar>
  <UseTucker>false</UseTucker>
  <WeightDocomposition>false</WeightDocomposition>
  <UseBlockWeight>false</UseBlockWeight>
  <BlockWeightIn>
    <int>20</int>
    <int>20</int>
    <int>20</int>
    <int>20</int>
    <int>20</int>
    <int>0</int>
    <int>0</int>
    <int>0</int>
    <int>0</int>
    <int>0</int>
    <int>0</int>
    <int>0</int>
  </BlockWeightIn>
  <BlockWeightMid>0</BlockWeightMid>
  <BlockWeightMid01>20</BlockWeightMid01>
  <BlockWeightMid02>20</BlockWeightMid02>
  <BlockWeightOut>
    <int>0</int>
    <int>0</int>
    <int>0</int>
    <int>0</int>
    <int>0</int>
    <int>0</int>
    <int>0</int>
    <int>20</int>
    <int>20</int>
    <int>20</int>
    <int>20</int>
    <int>20</int>
  </BlockWeightOut>
  <BlockWeightOffsetIn>0</BlockWeightOffsetIn>
  <BlockWeightOffsetOut>0</BlockWeightOffsetOut>
  <BlockWeightPresetTypeIn>none</BlockWeightPresetTypeIn>
  <BlockWeightPresetTypeOut>none</BlockWeightPresetTypeOut>
  <BlockWeightZeroThreshold>0</BlockWeightZeroThreshold>
  <UseBlockDim>false</UseBlockDim>
  <BlockDimIn>
    <int>64</int>
    <int>64</int>
    <int>64</int>
    <int>64</int>
    <int>64</int>
    <int>64</int>
    <int>64</int>
    <int>64</int>
    <int>64</int>
    <int>64</int>
    <int>64</int>
    <int>64</int>
  </BlockDimIn>
  <BlockDimMid>32</BlockDimMid>
  <BlockDimMid01>4</BlockDimMid01>
  <BlockDimMid02>4</BlockDimMid02>
  <BlockDimBase>4</BlockDimBase>
  <BlockDimOutSDXL>4</BlockDimOutSDXL>
  <BlockDimOut>
    <int>64</int>
    <int>64</int>
    <int>64</int>
    <int>64</int>
    <int>64</int>
    <int>64</int>
    <int>64</int>
    <int>64</int>
    <int>64</int>
    <int>64</int>
    <int>64</int>
    <int>64</int>
  </BlockDimOut>
  <BlockAlphaIn />
  <BlockAlphaMid>-1</BlockAlphaMid>
  <BlockAlphaOut />
  <BlockAlphaInM>
    <decimal>64</decimal>
    <decimal>64</decimal>
    <decimal>64</decimal>
    <decimal>64</decimal>
    <decimal>64</decimal>
    <decimal>64</decimal>
    <decimal>64</decimal>
    <decimal>64</decimal>
    <decimal>64</decimal>
    <decimal>64</decimal>
    <decimal>64</decimal>
    <decimal>64</decimal>
  </BlockAlphaInM>
  <BlockAlphaMidM>32</BlockAlphaMidM>
  <BlockAlphaMid01>4</BlockAlphaMid01>
  <BlockAlphaMid02>4</BlockAlphaMid02>
  <BlockAlphaBase>4</BlockAlphaBase>
  <BlockAlphaOutSDXL>4</BlockAlphaOutSDXL>
  <BlockAlphaOutM>
    <decimal>64</decimal>
    <decimal>64</decimal>
    <decimal>64</decimal>
    <decimal>64</decimal>
    <decimal>64</decimal>
    <decimal>64</decimal>
    <decimal>64</decimal>
    <decimal>64</decimal>
    <decimal>64</decimal>
    <decimal>64</decimal>
    <decimal>64</decimal>
    <decimal>64</decimal>
  </BlockAlphaOutM>
  <UseColorAug>false</UseColorAug>
  <UseFastLoading>true</UseFastLoading>
  <DontSaveMetadata>false</DontSaveMetadata>
  <UseFlipAug>false</UseFlipAug>
  <CropRandomly>false</CropRandomly>
  <CacheLatents>true</CacheLatents>
  <CacheLatentsToDisk>true</CacheLatentsToDisk>
  <HighVRAM>false</HighVRAM>
  <UseAdditionalOptArgs>false</UseAdditionalOptArgs>
  <LRSchedulerCycle>4</LRSchedulerCycle>
  <GradAccSteps>1</GradAccSteps>
  <DataLoaderThreads>1</DataLoaderThreads>
  <MaxTokens>75</MaxTokens>
  <mixedPrecisionType>fp16</mixedPrecisionType>
  <WeightDecay>0</WeightDecay>
  <Eps>1E-06</Eps>
  <D0>1E-06</D0>
  <GrowthRate>0</GrowthRate>
  <Betas0>0.9</Betas0>
  <Betas1>0.999</Betas1>
  <Betas2>0.999</Betas2>
  <DAdaptMomentum>0.9</DAdaptMomentum>
  <ProdigyBeta3>0</ProdigyBeta3>
  <DCoef>1</DCoef>
  <Decouple>false</Decouple>
  <NoProx>false</NoProx>
  <SafeguardWarmup>false</SafeguardWarmup>
  <UseBiasCorrection>false</UseBiasCorrection>
  <StableDiffusionType>XL</StableDiffusionType>
  <NoHalfVAE>false</NoHalfVAE>
  <CacheTextencoder>false</CacheTextencoder>
  <CacheTextencoderToDisk>false</CacheTextencoderToDisk>
  <IsEpoch>true</IsEpoch>
  <UseFullFP16>false</UseFullFP16>
  <UseFP8Base>false</UseFP8Base>
  <RelativeStep>true</RelativeStep>
  <ScaleParameter>true</ScaleParameter>
  <SaveState>false</SaveState>
  <MaskLoss>false</MaskLoss>
  <RandomNoiseOffset>false</RandomNoiseOffset>
  <RandomIpNoiseGamma>false</RandomIpNoiseGamma>
  <TokensSeparator />
  <LossType>LTwo</LossType>
  <HuberScheduleType>SNR</HuberScheduleType>
  <HuberC>0.1</HuberC>
  <LoRAPlusLRRatio>0</LoRAPlusLRRatio>
  <LoRAPlusUnetLRRatio>0</LoRAPlusUnetLRRatio>
  <LoRAPlusTELRRatio>0</LoRAPlusTELRRatio>
</TrainParams>
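The empty path fields in the preset can also be filled in by script instead of through the GUI. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`; the function name `fill_preset` and the sample paths are made up for illustration. Note that ElementTree does not keep the XML declaration or the unused `xmlns:xsi`/`xmlns:xsd` attributes when serializing, and I have not checked whether Kohya_LoRA_param_GUI accepts a file rewritten this way.

```python
import xml.etree.ElementTree as ET


def fill_preset(xml_text: str, values: dict[str, str]) -> str:
    """Set the text of the named top-level elements and return the XML as a string."""
    root = ET.fromstring(xml_text)
    for tag, text in values.items():
        element = root.find(tag)
        if element is None:
            raise KeyError(f"preset has no <{tag}> element")
        element.text = text
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Shortened stand-in for the preset above; the paths are hypothetical.
    sample = (
        "<TrainParams><ModelPath></ModelPath><TrainImagePath></TrainImagePath>"
        "<OutputPath></OutputPath><Epochs>8</Epochs></TrainParams>"
    )
    filled = fill_preset(
        sample,
        {
            "ModelPath": "C:/models/base.safetensors",
            "TrainImagePath": "C:/train/imls",
            "OutputPath": "C:/lora",
        },
    )
    print(filled)
```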


Running the SD benchmark

Posted: 2024/05/28


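I tried the benchmark from the Chimolog (ちもろぐ) article 【Stable Diffusion】AIイラストにおすすめなグラボをガチで検証【GPU別の生成速度】 on my own machine.

The settings are based on the "1024×1024: Toki (native high-resolution illustration)" test.

Test machine

Component	Product	Notes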
OS	Windows 11 Pro	Version 22621.3593
CPU	Intel Core i7 13700	16C24T, 2.1 GHz, TDP 65W
GPU	GeForce RTX 4070 Ti	Driver: NVIDIA 555.85
MEM	Crucial Ballistix BL2K16G32C16U4B	DDR4-3200 16GB * 4
M/B	ASUS TUF GAMING Z790-PLUS D4	ATX, Z790
CPU Cooler	Noctua NH-U12A
CPU Fan	Fractal Design Prisma AL-12 PWM	120mm PWM
SSD	Solidigm P44 Pro SSDPFKKW020X7X1	NVMe SSD 2TB

Common settings

Stable Diffusion settings

The following settings are used in every case (the models are listed later).

Setting	Value
Clip skip	2
ENSD	31337
Prompt	`1girl, toki \(blue archive\), blue archive, toki sits cross-legged in her chair. looking at viewer, cowboy shot, masterpiece, best quality, newest,`
Negative Prompt	`nsfw, lowres, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry, artist name,`
VAE	Automatic
Sampling method	Euler a
Sampling steps	15
Width	1024px
Height	1024px
Batch count	5
Batch size	1
CFG Scale	7
Seed	50

NVIDIA Control Panel settings

Setting	Value
CUDA - Sysmem Fallback Policy	Prefer No Sysmem Fallback
Power management mode	Prefer maximum performance

Windows Defender real-time protection

Enabled unless noted otherwise.

Benchmark results

Animagine XL 3.1 --xformers --opt-channelslast

Launch options

set COMMANDLINE_ARGS=--xformers --opt-channelslast

Model used

animagine-xl-3.1.safetensors [e3c47aedb0]

Score

Processing time	25sec
Average generation speed	3.64it/s

15/15 [00:04<00:00,  3.65it/s]
15/15 [00:03<00:00,  3.78it/s]
15/15 [00:03<00:00,  3.77it/s]
15/15 [00:03<00:00,  3.77it/s]
15/15 [00:03<00:00,  3.77it/s]
75/75 [00:25<00:00,  2.98it/s]
75/75 [00:25<00:00,  3.75it/s]
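The post does not say how the average generation speed is derived, but the figure above happens to match the arithmetic mean of the rates in all seven log lines (the five per-image lines plus the two total lines). Here is a small sketch of that calculation; `rates_in_its` and `mean_rate` are names invented for this example, and the averaging rule itself is my inference from the numbers.

```python
import re

# Matches the rate at the end of a tqdm progress line, e.g. "3.65it/s]" or "1.00s/it]".
RATE_PATTERN = re.compile(r"(\d+(?:\.\d+)?)(it/s|s/it)\]")

LOG = """\
15/15 [00:04<00:00,  3.65it/s]
15/15 [00:03<00:00,  3.78it/s]
15/15 [00:03<00:00,  3.77it/s]
15/15 [00:03<00:00,  3.77it/s]
15/15 [00:03<00:00,  3.77it/s]
75/75 [00:25<00:00,  2.98it/s]
75/75 [00:25<00:00,  3.75it/s]
"""


def rates_in_its(log: str) -> list[float]:
    """Extract every rate from the log, converted to iterations per second."""
    rates = []
    for value, unit in RATE_PATTERN.findall(log):
        number = float(value)
        # tqdm reports seconds per iteration for slow runs, so invert those.
        rates.append(number if unit == "it/s" else 1.0 / number)
    return rates


def mean_rate(log: str) -> float:
    """Return the arithmetic mean of all rates found in the log."""
    rates = rates_in_its(log)
    if not rates:
        raise ValueError("no tqdm rates found in log")
    return sum(rates) / len(rates)


if __name__ == "__main__":
    print(f"{mean_rate(LOG):.2f}it/s")
```

Averaging only the per-image lines would give a somewhat higher number, so it matters which lines are included when comparing against other people's scores.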

Animagine XL 3.1 --xformers

Launch options

set COMMANDLINE_ARGS=--xformers

Model used

animagine-xl-3.1.safetensors [e3c47aedb0]

Score

Processing time	27sec
Average generation speed	3.29it/s

15/15 [00:04<00:00,  3.64it/s]
15/15 [00:05<00:00,  2.60it/s]
15/15 [00:04<00:00,  3.44it/s]
15/15 [00:04<00:00,  3.42it/s]
15/15 [00:04<00:00,  3.40it/s]
75/75 [00:27<00:00,  2.73it/s]
75/75 [00:27<00:00,  3.78it/s]

Animagine XL 3.1 no launch options

Launch options

None

Model used

animagine-xl-3.1.safetensors [e3c47aedb0]

Score

Processing time	32sec
Average generation speed	2.64it/s

15/15 [00:05<00:00,  2.60it/s]
15/15 [00:05<00:00,  2.73it/s]
15/15 [00:05<00:00,  2.73it/s]
15/15 [00:05<00:00,  2.70it/s]
15/15 [00:05<00:00,  2.66it/s]
75/75 [00:32<00:00,  2.32it/s]
75/75 [00:32<00:00,  2.74it/s]

Animagine XL 3.0 --xformers --opt-channelslast

Launch options

set COMMANDLINE_ARGS=--xformers --opt-channelslast

Model used

animagineXLV3_v30.safetensors [e3c47aedb0]

Score

Processing time	25sec
Average generation speed	3.61it/s

15/15 [00:04<00:00,  3.59it/s]
15/15 [00:03<00:00,  3.75it/s]
15/15 [00:03<00:00,  3.77it/s]
15/15 [00:03<00:00,  3.76it/s]
15/15 [00:04<00:00,  3.74it/s]
75/75 [00:25<00:00,  2.95it/s]
75/75 [00:25<00:00,  3.70it/s]

Animagine XL 3.0 --xformers --opt-channelslast --medvram-sdxl, Windows Defender real-time protection disabled

Launch options

Windows Defender real-time protection is disabled.

set COMMANDLINE_ARGS=--xformers --opt-channelslast --medvram-sdxl
set PYTORCH_CUDA_ALLOC_CONF=garbage_collection_threshold:0.6, max_split_size_mb:128

Model used

animagineXLV3_v30.safetensors [e3c47aedb0]

Score

Processing time	45sec
Average generation speed	2.99it/s

15/15 [00:05<00:00,  2.60it/s]
15/15 [00:04<00:00,  3.27it/s]
15/15 [00:04<00:00,  3.25it/s]
15/15 [00:04<00:00,  3.27it/s]
15/15 [00:04<00:00,  3.24it/s]
75/75 [00:45<00:00,  1.67it/s]
75/75 [00:45<00:00,  3.66it/s]

Animagine XL 3.0 --xformers --opt-channelslast --medvram-sdxl

Launch options

set COMMANDLINE_ARGS=--xformers --opt-channelslast --medvram-sdxl

Model used

animagineXLV3_v30.safetensors [e3c47aedb0]

Score

Processing time	45sec
Average generation speed	2.92it/s

15/15 [00:07<00:00,  2.11it/s]
15/15 [00:04<00:00,  3.25it/s]
15/15 [00:04<00:00,  3.24it/s]
15/15 [00:04<00:00,  3.26it/s]
15/15 [00:04<00:00,  3.27it/s]
75/75 [00:45<00:00,  1.64it/s]
75/75 [00:45<00:00,  3.68it/s]

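Bonus 1 --xformers

This benchmark was run with a different graphics driver version. The exact version is unknown, but it was probably 531.79 or earlier.

NVIDIA Control Panel settings

Setting	Value
CUDA - Sysmem Fallback Policy	Setting not available
Power management mode	Prefer maximum performance

Launch options

set COMMANDLINE_ARGS=--xformers

Model used

animagine-xl-3.1.safetensors [e3c47aedb0]

Score

Processing time	73sec
Average generation speed	1.14it/s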

15/15 [00:12<00:00,  1.23it/s]
15/15 [00:12<00:00,  1.16it/s]
15/15 [00:13<00:00,  1.11it/s]
15/15 [00:13<00:00,  1.13it/s]
15/15 [00:13<00:00,  1.12it/s]
75/75 [01:13<00:00,  1.02it/s]
75/75 [01:13<00:00,  1.22it/s]

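Bonus 2

This benchmark was run with a different graphics driver version. The exact version is unknown, but it was probably 531.79 or earlier.

NVIDIA Control Panel settings

Setting	Value
CUDA - Sysmem Fallback Policy	Setting not available
Power management mode	Prefer maximum performance

Launch options

None

Model used

animagine-xl-3.1.safetensors [e3c47aedb0]

Score

Processing time	75sec
Average generation speed	1.08it/s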

15/15 [00:14<00:00,  1.04it/s]
15/15 [00:13<00:00,  1.10it/s]
15/15 [00:13<00:00,  1.09it/s]
15/15 [00:13<00:00,  1.11it/s]
15/15 [00:13<00:00,  1.12it/s]
75/75 [01:15<00:00,  1.00s/it]
75/75 [01:15<00:00,  1.12it/s]

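Summary

Performance impact of the settings and other observations

Windows Defender real-time protection does not seem to have a significant effect on generation speed
There seems to be no particular difference in generation speed between Animagine XL 3.1 and Animagine XL 3.0
Adding --medvram-sdxl makes it slower
set PYTORCH_CUDA_ALLOC_CONF=garbage_collection_threshold:0.6, max_split_size_mb:128 did not seem to make any difference
Adding --xformers makes it faster
Adding --opt-channelslast makes it slightly faster
Updating the NVIDIA driver makes it dramatically faster
In the Chimolog SD benchmark, the 4070 Ti scored 3.15it/s on "1024×1024: Toki (native high-resolution illustration)", while my machine averaged 3.64it/s, with a best of 3.75it/s and a worst of 2.98it/s
- It is unclear which figure the Chimolog benchmark reports, but Chimolog used a Core i9 13900K and I used a Core i7 13700, so I think this is a decent score for the CPU
Judging from the original article, the Chimolog author also seems to generate a fair amount of NSFW content, which I found reassuring (?)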
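To put numbers on "faster" and "dramatically faster", the average speeds from the score tables can be turned into ratios. This is just arithmetic on the figures reported in this post; `SCORES`, `CHIMOLOG_4070TI`, and `speedup` are names made up for the sketch, and each configuration was measured once, so the ratios should be read as rough.

```python
# Average generation speeds (it/s) copied from the score tables in this post.
SCORES = {
    "xformers + opt-channelslast": 3.64,
    "xformers": 3.29,
    "no options": 2.64,
    "xformers, old driver": 1.14,
    "no options, old driver": 1.08,
}
CHIMOLOG_4070TI = 3.15  # 4070 Ti score quoted from the Chimolog article


def speedup(new: float, old: float) -> float:
    """Return how many times faster `new` is than `old`."""
    return new / old


if __name__ == "__main__":
    comparisons = [
        ("--xformers vs no options", SCORES["xformers"], SCORES["no options"]),
        ("adding --opt-channelslast", SCORES["xformers + opt-channelslast"], SCORES["xformers"]),
        ("new driver vs old driver (--xformers)", SCORES["xformers"], SCORES["xformers, old driver"]),
        ("this machine vs Chimolog", SCORES["xformers + opt-channelslast"], CHIMOLOG_4070TI),
    ]
    for label, new, old in comparisons:
        print(f"{label}: x{speedup(new, old):.2f}")
```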

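Generated images

There are no quality prompts or Hires. fix here, so the quality is mediocre, but I am posting them anyway.

Bonus

A version with quality prompts and Hires. fix added. The detail and anatomy are much better.

Bonus

This no longer has anything to do with the benchmark, but while I was at it I also generated my usual character. She was cute to begin with, but the power of SDXL has made her considerably more refined and cuter.

Prompt and settings

Only now, about a year and a half after coming up with this prompt, did I notice that it says shot hair instead of short hair.

Setting	Value
Clip skip	2
ENSD	31337
Prompt	`(illustration:1.0), masterpiece, best quality, 1girl, solo, happy, smile, theater, (perspective:1.3), from below, (looking away:1.2), (from side:1.0), {{shot_hair}}, smile, bangs, shaggy, (brown hair:1.1), swept_bangs, thick_eyebrows, skin_fang, closed mouth, {{purple eyes}}, gray {{jacket}}, white shirt, glasses, {{small breasts}},`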
Negative Prompt	`nsfw, (worst quality, low quality:1.4), (depth of field, blurry, bokeh:1.5), (greyscale, monochrome:1.0), multiple views, text, title, logo, signature, (tooth, lip, nose, 3d, realistic:1.0), dutch angle,(cropped:1.4), text, title, signature, logo, (loli:1.2), school satchel, pink, school bag, school uniform, from behind`
Model	animagine-xl-3.1.safetensors [e3c47aedb0]
VAE	Automatic
Sampling method	Euler a
Hires. fix	Upscaler: Latent, Hires steps: 0, Denoising strength: 0.7, upscale by: 2
Sampling steps	20
Width	768px
Height	768px
Batch count	6
Batch size	1
CFG Scale	7
Seed	-1

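Bonus benchmark score

For reference, here is the generation speed with the settings above and Batch count changed to 6. Note that shot hair in the prompt has been corrected to short hair.

Processing time	118sec
Average generation speed	3.09it/s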

20/20 [00:04<00:00,  4.45it/s]
20/20 [00:12<00:00,  1.55it/s]
20/20 [00:04<00:00,  4.94it/s]
20/20 [00:12<00:00,  1.58it/s]
20/20 [00:03<00:00,  5.41it/s]
20/20 [00:12<00:00,  1.59it/s]
20/20 [00:03<00:00,  5.01it/s]
20/20 [00:12<00:00,  1.59it/s]
20/20 [00:03<00:00,  5.08it/s]
20/20 [00:12<00:00,  1.59it/s]
20/20 [00:03<00:00,  5.21it/s]
20/20 [00:12<00:00,  1.59it/s]
240/240 [01:53<00:00,  2.11it/s]
240/240 [01:53<00:00,  1.59it/s]

And this is the one that came out of the measurement run.


How to set up AUTOMATIC1111 on Windows 11

Posted: 2023/05/06


Preparation

Install the required components beforehand.

Setup commands

These are meant to be run in a shell on a POSIX compatibility layer such as MSYS2.

# get Stable Diffusion web UI
git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui.git

# get extensions
git clone https://github.com/nolanaatama/sd-webui-tunnels stable-diffusion-webui/extensions/sd-webui-tunnels
git clone https://github.com/Mikubill/sd-webui-controlnet stable-diffusion-webui/extensions/sd-webui-controlnet
git clone https://github.com/fkunn1326/openpose-editor stable-diffusion-webui/extensions/openpose-editor
git clone https://github.com/yfszzx/stable-diffusion-webui-images-browser stable-diffusion-webui/extensions/stable-diffusion-webui-images-browser
git clone https://github.com/DominikDoom/a1111-sd-webui-tagcomplete stable-diffusion-webui/extensions/a1111-sd-webui-tagcomplete
git clone https://github.com/Bing-su/dddetailer stable-diffusion-webui/extensions/dddetailer
git clone https://github.com/mcmonkeyprojects/sd-dynamic-thresholding stable-diffusion-webui/extensions/sd-dynamic-thresholding

# make resource dirs
mkdir -p stable-diffusion-webui/models/ESRGAN/
mkdir -p stable-diffusion-webui/models/Lora/
mkdir -p stable-diffusion-webui/models/VAE/
mkdir -p stable-diffusion-webui/models/hypernetworks/
mkdir -p stable-diffusion-webui/extensions/sd-webui-controlnet/models/

# get controlnet
curl -Lo stable-diffusion-webui/extensions/sd-webui-controlnet/models/t2iadapter_canny_sd14v1.pth https://huggingface.co/TencentARC/T2I-Adapter/resolve/main/models/t2iadapter_canny_sd14v1.pth
curl -Lo stable-diffusion-webui/extensions/sd-webui-controlnet/models/t2iadapter_color_sd14v1.pth https://huggingface.co/TencentARC/T2I-Adapter/resolve/main/models/t2iadapter_color_sd14v1.pth
curl -Lo stable-diffusion-webui/extensions/sd-webui-controlnet/models/t2iadapter_depth_sd14v1.pth https://huggingface.co/TencentARC/T2I-Adapter/resolve/main/models/t2iadapter_depth_sd14v1.pth
curl -Lo stable-diffusion-webui/extensions/sd-webui-controlnet/models/t2iadapter_keypose_sd14v1.pth https://huggingface.co/TencentARC/T2I-Adapter/resolve/main/models/t2iadapter_keypose_sd14v1.pth
curl -Lo stable-diffusion-webui/extensions/sd-webui-controlnet/models/t2iadapter_openpose_sd14v1.pth https://huggingface.co/TencentARC/T2I-Adapter/resolve/main/models/t2iadapter_openpose_sd14v1.pth
curl -Lo stable-diffusion-webui/extensions/sd-webui-controlnet/models/t2iadapter_seg_sd14v1.pth https://huggingface.co/TencentARC/T2I-Adapter/resolve/main/models/t2iadapter_seg_sd14v1.pth
curl -Lo stable-diffusion-webui/extensions/sd-webui-controlnet/models/t2iadapter_sketch_sd14v1.pth https://huggingface.co/TencentARC/T2I-Adapter/resolve/main/models/t2iadapter_sketch_sd14v1.pth
curl -Lo stable-diffusion-webui/extensions/sd-webui-controlnet/models/t2iadapter_style_sd14v1.pth https://huggingface.co/TencentARC/T2I-Adapter/resolve/main/models/t2iadapter_style_sd14v1.pth
curl -Lo stable-diffusion-webui/extensions/sd-webui-controlnet/models/control_v11e_sd15_ip2p.pth https://huggingface.co/lllyasviel/ControlNet-v1-1/resolve/main/control_v11e_sd15_ip2p.pth
curl -Lo stable-diffusion-webui/extensions/sd-webui-controlnet/models/control_v11e_sd15_shuffle.pth https://huggingface.co/lllyasviel/ControlNet-v1-1/resolve/main/control_v11e_sd15_shuffle.pth
curl -Lo stable-diffusion-webui/extensions/sd-webui-controlnet/models/control_v11f1e_sd15_tile.pth https://huggingface.co/lllyasviel/ControlNet-v1-1/resolve/main/control_v11f1e_sd15_tile.pth
curl -Lo stable-diffusion-webui/extensions/sd-webui-controlnet/models/control_v11f1p_sd15_depth.pth https://huggingface.co/lllyasviel/ControlNet-v1-1/resolve/main/control_v11f1p_sd15_depth.pth
curl -Lo stable-diffusion-webui/extensions/sd-webui-controlnet/models/control_v11p_sd15_canny.pth https://huggingface.co/lllyasviel/ControlNet-v1-1/resolve/main/control_v11p_sd15_canny.pth
curl -Lo stable-diffusion-webui/extensions/sd-webui-controlnet/models/control_v11p_sd15_inpaint.pth https://huggingface.co/lllyasviel/ControlNet-v1-1/resolve/main/control_v11p_sd15_inpaint.pth
curl -Lo stable-diffusion-webui/extensions/sd-webui-controlnet/models/control_v11p_sd15_lineart.pth https://huggingface.co/lllyasviel/ControlNet-v1-1/resolve/main/control_v11p_sd15_lineart.pth
curl -Lo stable-diffusion-webui/extensions/sd-webui-controlnet/models/control_v11p_sd15_mlsd.pth https://huggingface.co/lllyasviel/ControlNet-v1-1/resolve/main/control_v11p_sd15_mlsd.pth
curl -Lo stable-diffusion-webui/extensions/sd-webui-controlnet/models/control_v11p_sd15_normalbae.pth https://huggingface.co/lllyasviel/ControlNet-v1-1/resolve/main/control_v11p_sd15_normalbae.pth
curl -Lo stable-diffusion-webui/extensions/sd-webui-controlnet/models/control_v11p_sd15_openpose.pth https://huggingface.co/lllyasviel/ControlNet-v1-1/resolve/main/control_v11p_sd15_openpose.pth
curl -Lo stable-diffusion-webui/extensions/sd-webui-controlnet/models/control_v11p_sd15_scribble.pth https://huggingface.co/lllyasviel/ControlNet-v1-1/resolve/main/control_v11p_sd15_scribble.pth
curl -Lo stable-diffusion-webui/extensions/sd-webui-controlnet/models/control_v11p_sd15_seg.pth https://huggingface.co/lllyasviel/ControlNet-v1-1/resolve/main/control_v11p_sd15_seg.pth
curl -Lo stable-diffusion-webui/extensions/sd-webui-controlnet/models/control_v11p_sd15_softedge.pth https://huggingface.co/lllyasviel/ControlNet-v1-1/resolve/main/control_v11p_sd15_softedge.pth
curl -Lo stable-diffusion-webui/extensions/sd-webui-controlnet/models/control_v11p_sd15s2_lineart_anime.pth https://huggingface.co/lllyasviel/ControlNet-v1-1/resolve/main/control_v11p_sd15s2_lineart_anime.pth

# get embeddings
curl -Lo stable-diffusion-webui/embeddings/badhandv4.pt https://civitai.com/api/download/models/20068
curl -Lo stable-diffusion-webui/embeddings/EasyNegative.pt https://huggingface.co/datasets/gsdf/EasyNegative/resolve/main/EasyNegative.pt

# get model
# upscaler
curl -Lo stable-diffusion-webui/models/ESRGAN/4x-UltraSharp.pth https://huggingface.co/nolanaatama/ESRGAN/resolve/main/4x-UltraSharp.pth
curl -Lo stable-diffusion-webui/models/ESRGAN/TGHQFace8x_500k.pth https://huggingface.co/dwnmf/deliberatev2/resolve/main/TGHQFace8x_500k.pth

# model
curl -Lo stable-diffusion-webui/models/Stable-diffusion/AOM3A1B_orangemixs.safetensors https://huggingface.co/WarriorMama777/OrangeMixs/resolve/main/Models/AbyssOrangeMix3/AOM3A1B_orangemixs.safetensors

# vae
curl -Lo stable-diffusion-webui/models/VAE/orangemix.vae.pt https://huggingface.co/WarriorMama777/OrangeMixs/resolve/main/VAEs/orangemix.vae.pt
curl -Lo stable-diffusion-webui/models/VAE/kl-f8-anime2.ckpt https://huggingface.co/hakurei/waifu-diffusion-v1-4/resolve/main/vae/kl-f8-anime2.ckpt
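The ControlNet download lines repeat each file name twice, once in the destination path and once in the URL, which makes typos easy. As a rough sketch, the commands could be generated from a single list of names instead. The base URL and destination directory are the ones used above, while `curl_command` and the shortened `FILENAMES` list are made up for this example.

```python
# Base URL and destination directory as used in the setup commands above.
BASE_URL = "https://huggingface.co/lllyasviel/ControlNet-v1-1/resolve/main"
DEST_DIR = "stable-diffusion-webui/extensions/sd-webui-controlnet/models"

# Shortened list for illustration; extend it with the other model names as needed.
FILENAMES = [
    "control_v11p_sd15_canny.pth",
    "control_v11p_sd15_openpose.pth",
    "control_v11f1p_sd15_depth.pth",
]


def curl_command(filename: str) -> str:
    """Build one curl line so the file name appears identically in path and URL."""
    return f"curl -Lo {DEST_DIR}/{filename} {BASE_URL}/{filename}"


if __name__ == "__main__":
    # Print the commands; paste them into the shell or redirect them to a script.
    for name in FILENAMES:
        print(curl_command(name))
```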

Launch command

The first launch takes quite a while because dddetailer installs its dependencies, so be patient.

./webui-user.bat

Settings

Settings -> User Interface
Append the following to the Quicksettings list
- , sd_vae, CLIP_stop_at_last_layers


Benchmarking Stable Diffusion on Colab Pro and an RTX 4070 Ti

Posted: 2023/05/06


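I ran a quick benchmark to see how much difference there is between Colab Pro and a local machine, and these are the results.

Rendering conditions

This time I compare the results of runs with the following settings. They are the same as last time.

I used the following notebook on Colab Pro.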

https://gist.github.com/Lycolia/cb432ad1b1ce083482b5487c131b5d12/80a059931c538b10d55cf9fcbf82220f24e64653

The settings are as follows.

Setting	Value
Prompt	`(illustration:1.0), masterpiece, best quality, 1girl, solo, happy, smile, theater, (perspective:1.3), from below, (looking away:1.2), (from side:1.0), {{shot_hair}}, smile, bangs, shaggy, (brown hair:1.1), swept_bangs, thick_eyebrows, skin_fang, closed mouth, {{purple eyes}}, gray {{jacket}}, white shirt, glasses, {{small breasts}},`
Negative Prompt	`nsfw, (worst quality, low quality:1.4), (depth of field, blurry, bokeh:1.5), (greyscale, monochrome:1.0), multiple views, text, title, logo, signature, (tooth, lip, nose, 3d, realistic:1.0), dutch angle,(cropped:1.4), text, title, signature, logo, (loli:1.2), school satchel, pink, school bag, school uniform, from behind`
Model	AOM3A1B
VAE	orangemix.vae.pt
Sampling method	Euler a
Sampling steps	20
Width	512px
Height	512px
Batch count	1
Batch size	1
CFG Scale	7
Seed	-1

Local machine specs

Local 1

Component	Details	Notes
OS	Windows 11 Pro
M/B	ASUS ROG STRIX Z390-F GAMING	PCIe 3.0
CPU	Intel Core i9-9900
GPU	GeForce RTX 4070 Ti
MEM	DDR4-3200 16GB * 4

Local 2

I heard that a newer PCIe version improves benchmark scores, so I tried it.

Component	Details	Notes
OS	Windows 11 Pro
M/B	ASUS TUF GAMING Z790-PLUS D4	PCIe 5.0
CPU	Intel Core i7 13700
GPU	GeForce RTX 4070 Ti
MEM	DDR4-3200 16GB * 4

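Benchmark results

Colab processing time varies with the time of day, so I measured at two points.

The 4070 Ti beat Colab Pro by a wide margin. The UI also felt far more stable and responsive than on Colab Pro, and there are other conveniences: output and environments can keep accumulating on local storage, startup is vastly faster, and settings are remembered. If you can run it locally, that felt like the easiest option.

The graphics card is a ZOTAC GAMING GeForce RTX 4070 Ti Trinity OC. Its temperature stays at around 60°C, so I think it is not bad.

Bonus

Again, here are a few images picked from those generated during the benchmark.

Bonus 2

A higher-quality version that I put together quickly, since generating locally is so easy.

The settings are as follows.

Setting	Value
Prompt	`(masterpiece, sidelighting, finely detailed beautiful eyes: 1.2), masterpiece*portrait, realistic, 3d face, glowing eyes, shiny hair, lustrous skin, (brown hair:1.1), short hair, smile, 1girl, embarassed, small breasts, theater, thick eyebrows, closed mouth, {{purple eyes}}, gray {{jacket}}, {{{{white shirt}}}}, glasses, {{small breasts}`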
Negative Prompt	`{{{{{{{nsfw}}}}}}, (worst quality, low quality:1.4), (depth of field, blurry, bokeh:1.5), (greyscale, monochrome:1.0), multiple views, text, title, logo, signature, (tooth, lip, nose, 3d, realistic:1.0), dutch angle,(cropped:1.4), text, title, signature, logo, (loli:1.2), school satchel, pink, school bag, school uniform, from behind`
Model	AOM3A1B
VAE	orangemix.vae.pt
Sampling steps	32
Sampling method	DPM++ SDE Karras
CFG Scale	7
Width	512px
Height	756px
Clip skip	2
Seed	-1


Comparing Stable Diffusion speed on the free and paid tiers of Google Colaboratory

Posted: 2023/04/11


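As of 2023-05-04, this may no longer be usable on the free tier.

I tried running Stable Diffusion on Google Colaboratory to see how much the free and paid tiers differ, and these are the results. My rough impression is that the difference is about 4x.

Rendering conditions

This time I compare the results of runs with the following settings.

The base notebook is the following.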

https://gist.github.com/Lycolia/cb432ad1b1ce083482b5487c131b5d12/80a059931c538b10d55cf9fcbf82220f24e64653

The settings are as follows. They are almost the defaults.

Setting	Value
Prompt	`(illustration:1.0), masterpiece, best quality, 1girl, solo, happy, smile, theater, (perspective:1.3), from below, (looking away:1.2), (from side:1.0), {{shot_hair}}, smile, bangs, shaggy, (brown hair:1.1), swept_bangs, thick_eyebrows, skin_fang, closed mouth, {{purple eyes}}, gray {{jacket}}, white shirt, glasses, {{small breasts}},`
Negative Prompt	`nsfw, (worst quality, low quality:1.4), (depth of field, blurry, bokeh:1.5), (greyscale, monochrome:1.0), multiple views, text, title, logo, signature, (tooth, lip, nose, 3d, realistic:1.0), dutch angle,(cropped:1.4), text, title, signature, logo, (loli:1.2), school satchel, pink, school bag, school uniform, from behind`
Model	AOM3A1B
VAE	orangemix.vae.pt
Sampling method	Euler a
Sampling steps	20
Width	512px
Height	512px
Batch count	1
Batch size	1
CFG Scale	7
Seed	-1

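Comparison results

The rendering time is the time shown for Total progress.

\	Colab free tier	Colab Pro
GPU class	Standard	Premium
Memory	Standard	High-RAM
GPU	Tesla T4	A100
System RAM	12.7 GB	83.5 GB
GPU RAM	15.0 GB	40.0 GB
Disk	166.8 GB	166.8 GB
Rendering time	8 sec	2 sec

Setup takes about 5 minutes on either tier.

Spec information for reference

Free tier

Selecting the standard GPU on Colab Pro gives the same result.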

Tue Apr 11 12:50:30 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.85.12    Driver Version: 525.85.12    CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            Off  | 00000000:00:04.0 Off |                    0 |
| N/A   70C    P8    33W /  70W |      0MiB / 15360MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Sep_21_10:33:58_PDT_2022
Cuda compilation tools, release 11.8, V11.8.89
Build cuda_11.8.r11.8/compiler.31833905_0
              total        used        free      shared  buff/cache   available
Mem:           12Gi       628Mi       7.4Gi       5.0Mi       4.6Gi        11Gi
Swap:            0B          0B          0B

Colab Pro

Buying paid compute through Pay As You Go gives the same GPU.

Tue Apr 11 12:55:03 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.85.12    Driver Version: 525.85.12    CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA A100-SXM...  Off  | 00000000:00:04.0 Off |                    0 |
| N/A   39C    P0    46W / 400W |      0MiB / 40960MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Sep_21_10:33:58_PDT_2022
Cuda compilation tools, release 11.8, V11.8.89
Build cuda_11.8.r11.8/compiler.31833905_0
              total        used        free      shared  buff/cache   available
Mem:           83Gi       756Mi        79Gi       1.0Mi       3.3Gi        82Gi
Swap:            0B          0B          0B