Tuesday, March 21st, is a national holiday called "Spring Equinox Day." It is one of the 24 solar terms, and on this day, day and night are almost the same length.
Today, I woke up as usual, a little after 5 am, and have been practicing English pronunciation since this morning. I'm using the smartphone app "ELSA SPEAKING," which uses AI to accurately judge pronunciation, so I can correct my pronunciation on my own.
Learning to pronounce English correctly makes it easier to understand the English spoken by native speakers. I am using this method because training your speaking skills is said to be effective in improving your listening skills.
One common issue with Japanese people's English is pronunciation. This is because the sounds of Japanese and English are very different, and speaking skills are not given much importance in school education.
I think that by training speaking and listening skills, it's possible to learn practical English more effectively.
Now, let's talk about the topic of AI image generation. Recently, I have been very interested in AI image generation and have tried various methods. However, some degree of knowledge is required to understand them, and sometimes it can be difficult to understand even when researching on the internet.
The other day, though, I finally succeeded in setting up a popular image-generating AI called "Stable Diffusion" on my computer. The name comes from diffusion models, the mathematical technique the AI is based on.
GUI of Stable Diffusion
To set up this environment, you need to install the Python programming environment and a tool called Git, and then run a batch file. Since the batch file runs in the Windows Command Prompt, its progress is displayed as text.
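Before running the batch file, it may help to confirm that both prerequisites can actually be found. The following is only a minimal sketch of such a check, not part of the official setup: the function name find_missing_tools is my own invention, and it assumes the tools are installed under the command names "python" and "git".

```python
import shutil


def find_missing_tools(tools=("python", "git")):
    """Return the command names in `tools` that cannot be found on the PATH.

    The default names "python" and "git" are assumptions; on some systems
    the commands may be installed under different names.
    """
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("Python and Git were both found.")
```

If either name is reported as missing, the installer for that tool probably did not add it to the PATH, or the Command Prompt window was opened before the installation finished.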
I didn't know what a finished installation was supposed to look like, so I couldn't tell whether the work was done or not. As a result, I mistakenly assumed that the process had frozen and restarted it many times. It also took me half a day of research to solve the problem.
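One way to tell a finished setup from a frozen one is to check whether the web interface has started answering. The sketch below is only an illustration under assumptions: the address http://127.0.0.1:7860 is what I believe the web UI uses by default on a local machine, and wait_until_ready is a hypothetical helper name, not something the installer provides.

```python
import time
import urllib.error
import urllib.request


def wait_until_ready(url="http://127.0.0.1:7860", timeout_s=600, interval_s=5):
    """Poll `url` until a server answers, or give up after `timeout_s` seconds.

    The default URL is an assumption about where the web UI listens.
    Returns True if any HTTP response arrives, False on timeout.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            with urllib.request.urlopen(url, timeout=3):
                return True
        except urllib.error.HTTPError:
            # The server answered, even if with an error status, so it is up.
            return True
        except (urllib.error.URLError, OSError):
            # Nothing is listening yet (or the request timed out); keep waiting.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


# Example use (left as a comment so that nothing runs by accident):
# print("ready" if wait_until_ready() else "still not responding")
```

The first launch can take a long time because large files are downloaded, so a generous timeout is probably safer than restarting early.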
However, through trial and error, I gained knowledge and was finally able to generate images. I am now very satisfied.
By the way, I found a carefully explained video for beginners and would like to introduce it.
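日本一わかりやすいStableDiffusion WebUI AUTOMATIC1111(ローカル版)のインストール方法と基本的な使い方 (in English: "The easiest-to-understand guide in Japan to installing Stable Diffusion WebUI AUTOMATIC1111 (local version) and its basic usage")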
I tried creating an image using "Stable Diffusion" and I am amazed at how wonderful the outcome is. Although I am an amateur, I think the images created by this AI are of such high quality that it is difficult to tell them apart from images made by humans.
I had already been using the paid web service "FOTOR" to generate images. With FOTOR, anyone can easily create high-quality images. However, I was dissatisfied with the limited types of image models that could be generated and the long time it took to create images. Although there are ways to shorten the time, they require payment, so the service was not very user-friendly.
An image generated using FOTOR
On the other hand, I felt that "Stable Diffusion" is somewhat harder to set up, but once it is installed, it can be used for free and generates images in a short amount of time. I have created many images so far. Thanks to the advancement of AI, it is amazing that high-quality images can be created easily even by someone who cannot draw. This has brought about a very significant change in my life.
There are two ways to generate images using AI. The first one is to provide textual information to the AI, which is referred to as a "prompt". The second method is to provide image data.
An image generated by Stable Diffusion
The most common method for generating the images seen on the web is to use prompts. There are rules to follow when writing a prompt, but you don't need complex programming knowledge. You just need to describe the image you want to generate using words or sentences. For example, words like "girl" or "smile" would be sufficient.
The generated image varies depending on how you write the prompt, so creating an effective prompt is very important.
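To show what "writing a prompt" amounts to, here is a small sketch that joins words into one comma-separated prompt. The "(word:1.4)" form for emphasizing a word is taken from the sample prompt later in this post; the function name build_prompt and its parameters are my own and are not part of Stable Diffusion.

```python
def build_prompt(tags, weights=None):
    """Join tags into a comma-separated prompt string.

    `weights` is an optional dict mapping a tag to an emphasis value;
    weighted tags are written as "(tag:weight)", the form that appears
    in the sample prompt in this post.
    """
    weights = weights or {}
    parts = []
    for tag in tags:
        tag = tag.strip()
        if not tag:
            continue  # skip empty entries
        if tag in weights:
            parts.append(f"({tag}:{weights[tag]})")
        else:
            parts.append(tag)
    return ", ".join(parts)


if __name__ == "__main__":
    print(build_prompt(["girl", "smile"]))
    print(build_prompt(["masterpiece", "photorealistic"], {"photorealistic": 1.4}))
```

Keeping the words in a list like this makes it easy to add, remove, or reorder them and compare how the generated image changes.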
I am still not very good at creating prompts, but I plan to learn how to write better ones so that I can get the images I want with fewer attempts.
One easy way to create prompts is to have ChatGPT create them for you. Let me explain how I generated images of 2B from "Nier Automata" and Asuna from "Sword Art Online" using this method. I asked ChatGPT to describe the characters' physical features, and then I requested an English prompt based on the results. This method is very convenient.
Sample 1: "2B" from "Nier Automata"
Here is a sample prompt created using ChatGPT.
masterpiece, (photorealistic:1.4), best quality, beautiful lighting, yorha no. 2 type b, 1girl, slit skirt, black blindfold, black dress, black hairband, blindfold, blue sky, boots, building, city, cloud, covered eyes, debris, dress, feather-trimmed sleeves, feather trim, from below, gloves, grass, hairband, high heel boots, high heels, highres, juliet sleeves, katana, leather, leather boots, long sleeves, nier \(series\), nier automata, outdoors, overgrown, post-apocalypse, puddle, puffy sleeves, rubble, ruins, scenery, sky, thigh boots, thighhighs, thighhighs under boots, water <lora:yorha_noDOT_2_type_b:0.65> RAW photo, 8k uhd, film grain, schoolgilr,
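The prompt above mixes three kinds of items: plain tags, a weighted tag such as "(photorealistic:1.4)", and an extra-model reference such as "<lora:yorha_noDOT_2_type_b:0.65>". The sketch below pulls them apart so the structure is easier to see. It is my own simplification for reading prompts, not how the web UI itself parses them, and the name parse_prompt is hypothetical.

```python
import re

# "<lora:name:weight>" references, as seen in the sample prompt.
LORA_RE = re.compile(r"<lora:([^:>]+):([0-9]*\.?[0-9]+)>")
# A whole tag of the form "(tag:weight)".
WEIGHTED_RE = re.compile(r"^\(([^():]+):([0-9]*\.?[0-9]+)\)$")


def parse_prompt(prompt):
    """Split a prompt into (tag, weight) pairs and (lora_name, weight) pairs.

    Simplifications (assumptions of this sketch): tags are separated by
    commas, a LoRA reference also acts as a separator, and unweighted
    tags get a weight of 1.0.
    """
    loras = [(name, float(weight)) for name, weight in LORA_RE.findall(prompt)]
    text = LORA_RE.sub(",", prompt)
    tags = []
    for raw in text.split(","):
        tag = raw.strip()
        if not tag:
            continue
        match = WEIGHTED_RE.match(tag)
        if match:
            tags.append((match.group(1), float(match.group(2))))
        else:
            tags.append((tag, 1.0))
    return tags, loras


if __name__ == "__main__":
    sample = "masterpiece, (photorealistic:1.4), water <lora:yorha_noDOT_2_type_b:0.65> RAW photo,"
    print(parse_prompt(sample))
```

Escaped parentheses such as "nier \(series\)" are left untouched, because they do not match the "(tag:weight)" pattern.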
Sample 2: "ASUNA" from "Sword Art Online"
Here is a sample prompt created using ChatGPT.
Asuna is one of the heroines in the light novel "Sword Art Online" and its adaptations in anime and video games, and she has the following physical characteristics: Her hair is long and black, tied back behind her head, and flows smoothly, emphasizing her beauty. Her eyes are large and deep brown, with slightly thick eyeliner that enhances her beauty even more. Her face has a very well-proportioned small face, giving off a beautiful impression. Her nose is small and high, with an attractive shape. In addition, her lips are very charming and have a light pink color. In "Sword Art Online," the game where she first appears, she wears a white and blue lace outfit. Later, she wears various costumes while remaining in the "Aincrad" world, wearing a simple blue outfit with boots in the game. In the sequel, "ALfheim Online," she is known as the "Knight of the Flowers" and wears an outfit inspired by a white princess dress, with a white skirt and top and frilly accessories., schoolgilr,
The depiction of 2B is almost perfect. I think it's because the model used has information about 2B. A model is the "style" used to generate images, and there are various models available, such as anime, comics, and 3D.
The depiction of Asuna is also very close to how the character actually looks, and I am amazed at the progress of AI, which can easily create such high-quality images.
Blog articles often contain only text, which can make them look dull, so it is very convenient to be able to add original images (but be careful about copyright).
There is still a lot I don't know about running "Stable Diffusion". I want to keep enjoying image creation as a hobby while learning more along the way.
I want to generate images using AI and also practice my English pronunciation. Today seems like a very busy day to indulge in my hobbies.