Some technologies related to AI video


Fooocus can generate photorealistic images.
Juggernaut V9 is the base model.
You need a computer with a GPU, or a hosted platform like RunDiffusion.
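There is no code in these notes; as a minimal sketch of how an SDXL checkpoint such as Juggernaut V9 can be run locally with the diffusers library (the Hugging Face model ID below is an assumption, and Fooocus itself is a separate UI that wraps models like this), it could look roughly like:

    import torch
    from diffusers import StableDiffusionXLPipeline

    # Load an SDXL checkpoint. "RunDiffusion/Juggernaut-XL-v9" is assumed to be the
    # Hugging Face ID for Juggernaut V9; substitute whichever checkpoint you actually use.
    pipe = StableDiffusionXLPipeline.from_pretrained(
        "RunDiffusion/Juggernaut-XL-v9",
        torch_dtype=torch.float16,
    ).to("cuda")  # needs a GPU; otherwise use a hosted platform such as RunDiffusion

    image = pipe(
        "photo of a young woman in a cafe, natural window light, 85mm lens, photorealistic",
        negative_prompt="cartoon, illustration, low quality",
        num_inference_steps=30,
    ).images[0]
    image.save("portrait.png")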

To bring the model to life, it takes more than just the technology; you also need a lot of other things:
 Not just that. Extra content, free stuff, exclusive photos, lingerie shots, topless and nudes. 

But I'm using Juggernaut V9 mostly as the base model. For consistent images, prompt engineering is a must.

Hey, I'm working with Fooocus UI and different SDXL models locally. I can't reveal too much about consistency, but we have a workflow built around prompt engineering.
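The actual workflow is not revealed in the quote above. One common prompt-engineering trick for consistency, sketched here purely as an assumption rather than as their method, is to reuse one very detailed character description and a fixed seed and only vary the scene:

    import torch
    from diffusers import StableDiffusionXLPipeline

    pipe = StableDiffusionXLPipeline.from_pretrained(
        "RunDiffusion/Juggernaut-XL-v9",  # assumed model ID, as above
        torch_dtype=torch.float16,
    ).to("cuda")

    # A fixed, very detailed character block that is pasted verbatim into every prompt.
    CHARACTER = (
        "25 year old woman, long auburn hair, green eyes, light freckles, "
        "small silver nose stud, photorealistic, 85mm portrait"
    )
    scenes = ["sitting in a cafe", "walking on a beach at sunset", "reading in a library"]

    for i, scene in enumerate(scenes):
        # Re-seeding with the same value every time keeps the renders closer together.
        generator = torch.Generator("cuda").manual_seed(1234)
        image = pipe(f"{CHARACTER}, {scene}", generator=generator).images[0]
        image.save(f"character_{i}.png")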

Script and scenes: ChatGPT
Characters and background images: Midjourney
Background removal: Photoshop / remove.bg
Voices: Eleven Labs (beta.elevenlabs.io)
Animation: RunDiffusion / Stable Diffusion / Genmo (rundiffusion.com / stability.ai / genmo.ai)

Stable Diffusion is an AI image-generation tool: a text-to-image model based on Latent Diffusion Models (LDMs). It produces a picture by starting from random noise in a latent space, repeatedly denoising it under the guidance of the text prompt, and then decoding the resulting latent into the final image.

Diffusion models are the deep generative models that have taken off in the image domain. For text-to-image generation, the CLIP model, which learns a joint representation of text and images, is used to turn the input text into an embedding that conditions the image generation.
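As a small concrete example of that text-encoding step, here is the CLIP text encoder that Stable Diffusion v1.x uses (openai/clip-vit-large-patch14), called through the transformers library:

    from transformers import CLIPTokenizer, CLIPTextModel

    # Stable Diffusion v1.x uses this CLIP text encoder to turn the prompt into embeddings.
    tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")
    text_encoder = CLIPTextModel.from_pretrained("openai/clip-vit-large-patch14")

    tokens = tokenizer(
        "a castle floating above the clouds, digital painting",
        padding="max_length",
        max_length=tokenizer.model_max_length,
        truncation=True,
        return_tensors="pt",
    )
    # One embedding per token; the diffusion U-Net attends to these while denoising.
    text_embeddings = text_encoder(tokens.input_ids).last_hidden_state
    print(text_embeddings.shape)  # torch.Size([1, 77, 768])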
Diffusion models: training can be split into two parts (a small numeric sketch of the forward step follows below):
1. Forward diffusion process: gradually add noise to an image;
2. Reverse diffusion process: learn to remove that noise step by step.
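A minimal numeric sketch of the forward (noising) step, using the closed form x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * noise with a simple linear beta schedule (the values are for illustration only):

    import torch

    T = 1000
    betas = torch.linspace(1e-4, 0.02, T)               # linear noise schedule
    alphas_cumprod = torch.cumprod(1.0 - betas, dim=0)  # alpha_bar_t

    def add_noise(x0, t):
        """Sample x_t ~ q(x_t | x_0): blend the clean image with Gaussian noise."""
        noise = torch.randn_like(x0)
        a_bar = alphas_cumprod[t]
        x_t = a_bar.sqrt() * x0 + (1.0 - a_bar).sqrt() * noise
        return x_t, noise  # during training, the network learns to predict `noise` from x_t and t

    x0 = torch.rand(1, 3, 64, 64)                 # a toy "image"
    x_noisy, target_noise = add_noise(x0, t=500)  # heavily noised version of x0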


Not really seeing this on here:
  • Read up on Ollama or Stable Diffusion
  • Have a decently powerful GPU
  • Play around with modeling to get good at it
  • Create a separate IG account that labels it as AI
  • Try your luck.
Modeling has gotten really good now, with little effort needed.


If you are wondering how to make a fully AI-generated video, look no further: I will show you exactly what tools I used and how you can get the same results as mine, which can be seen at the end of this video.

First things first: for a good video you need a script, characters, scenes, moving heads and some voices, right? So here is the list of tools that I used; let me show you exactly how I used each one of them and then how they work. The first one is ChatGPT, for ideas, scripts, conversations and scene write-ups. Then we used Midjourney for character and scene designs, remove.bg to remove the background behind our characters, Eleven Labs for artificial voices that sound as close as possible to my characters, and then a combination of Stable Diffusion, RunDiffusion and Genmo to animate the images. D-ID handled the talking heads, and Tweetgen generated the tweets (this can also be done with many other tools, or even the browser's code inspector). Lastly, a face-cartoon app: since I am still not well known, I had to recreate my own head with a simpler tool, because Midjourney was giving me only okay-ish results and I really wanted something better.

So how did I actually use these tools?
From ChatGPT I needed the ideas and what the actual video was going to be about, so I first fed the tool the freelancer.com contest that I wanted to participate in and added a prompt asking it to suggest potential directions for my new video. Once I had the creative direction, I instructed the tool to create new and different scenes, conversations and characters, as well as some celebrity cameos. Lastly, I needed a name for my show, so I prompted ChatGPT to give me a number of potential names. This is more of a reference point for the video; to get a great name you need to play around with it a bit and give feedback until you see the desired result, but it should not take long. Thanks to the AIPRM extension for ChatGPT, you can open a new thread, find the Midjourney prompt generator in AIPRM, and then just copy and paste the adjusted brief and scenes from ChatGPT straight into Midjourney.
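The video only shows the ChatGPT web UI; as a rough sketch of how the same ideas-and-scene-write-ups step could be scripted against the OpenAI API (the model name and the prompt wording are placeholders, not the author's), it could look like this:

    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    brief = "Paste the freelancer.com contest brief here..."  # placeholder

    response = client.chat.completions.create(
        model="gpt-4o-mini",  # any chat model works; this one is only an example
        messages=[
            {"role": "system", "content": "You are a creative director for short AI-generated videos."},
            {"role": "user", "content": (
                f"Here is a contest brief:\n{brief}\n"
                "Suggest three potential creative directions, then write scene-by-scene "
                "descriptions and short character dialogues for the best one, "
                "plus a list of possible show names."
            )},
        ],
    )
    print(response.choices[0].message.content)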
Next on my list was Midjourney. Once I actually had my script, scenes and characters, I needed to find real-life images of the mentioned characters, because Midjourney will not create a good-looking image of a given celebrity without a reference image. I will show you just one example, and you will need to replicate this setup for all of the characters in your video. For those of you who are new here: Midjourney is a brilliant tool that can produce amazing images out of plain text. Every prompt in Midjourney starts with /imagine; then you copy the image URL of the desired person (for us, that is Margot Robbie) and paste that link into the prompt. Once that is done, you can use the prompts from AIPRM, the ChatGPT extension for prompt creation, or write them yourself. In my case, because my show is set on an island, I used very descriptive wording. Lastly, and most importantly, I need my style and my parameters: I set a very specific style for reference and --ar 16:9, the aspect ratio, because I will need this format later for my YouTube video. Make sure to copy it exactly like I did, because even a small mistake and it won't work. I will also need my background scenes, so I just reuse the previous prompt without Margot's URL to keep them consistent.
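As a rough illustration of the prompt structure described above (the reference URL and the scene wording are placeholders, not the author's actual prompt):

    /imagine prompt: https://example.com/margot-reference.jpg cinematic portrait of a woman hosting a talk show on a tropical island, golden hour, photorealistic --ar 16:9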
The next thing we need to do is remove the background from our characters and put the scenes behind them, and this part is super easy. I used Photoshop because I know how to use it and it is a bit easier for me, but you can go to remove.bg if you do not have the experience: add the AI-generated photo of Margot, wait for the magic to happen, and there you have it. Now you can paste her image onto all the other backgrounds you have created, such as the one we generated from the same prompt.
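The video uses the remove.bg website directly; remove.bg also has an HTTP API, and a minimal sketch of calling it (the file names and the API key are placeholders) could look like this:

    import requests

    # Send a character render to remove.bg and save the cut-out with a transparent background.
    with open("margot.png", "rb") as image_file:
        response = requests.post(
            "https://api.remove.bg/v1.0/removebg",
            headers={"X-Api-Key": "YOUR_REMOVE_BG_API_KEY"},  # placeholder key
            files={"image_file": image_file},
            data={"size": "auto"},
        )

    response.raise_for_status()
    with open("margot_no_bg.png", "wb") as out_file:
        out_file.write(response.content)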
The next one is a brilliant AI called Eleven Labs. There is a big disclaimer here: I used Margot's voice solely for educational purposes, but it is her own property and it should not be used. You need to be sure that you have the right to use anyone's image or voice and have their consent. Now that we have got that out of the way: in Eleven Labs we go to Voices, then Instant Voice Cloning; we name it, describe it and give it a few samples so the tool can understand how it should sound, and then generate. Once we click generate we have our new voice. We click use, and you will see a couple of voice settings, like speed, clarity, and how similar or different the result is going to be. Lastly, and most importantly, in the text box you add the text that actually needs to be spoken.
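The video drives Eleven Labs through its web UI; the service also exposes a REST text-to-speech endpoint, and a minimal sketch (the voice ID, API key and spoken line are placeholders, and the field names follow the public API as I understand it) might look like:

    import requests

    VOICE_ID = "your-cloned-voice-id"     # placeholder: ID of the cloned voice from the web UI
    API_KEY = "YOUR_ELEVENLABS_API_KEY"   # placeholder

    response = requests.post(
        f"https://api.elevenlabs.io/v1/text-to-speech/{VOICE_ID}",
        headers={"xi-api-key": API_KEY, "Content-Type": "application/json"},
        json={
            "text": "Welcome to the island, let's meet today's guest!",  # placeholder line
            "model_id": "eleven_multilingual_v2",
            "voice_settings": {"stability": 0.5, "similarity_boost": 0.75},
        },
    )
    response.raise_for_status()
    with open("voice_line.mp3", "wb") as f:
        f.write(response.content)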
For image animation I started with Stable Diffusion first, and its hosted variant called RunDiffusion, but realized that it would take me too much time, so I found a much simpler solution called Genmo. I will soon create separate tutorials on both Stable Diffusion and RunDiffusion, but for now let's stick to Genmo. Genmo is actually short for "generation plus motion", and it is a tool that generates video from text. With its help we animated all the movement in the scenes we had generated with Midjourney in the previous steps. There are, in general, three options for creating the first frame or set: search, create and upload. We used upload, and uploaded the scenes that were previously generated. We also need to adjust the options: how long the video should be, how drastic the changes should be, how fast the scene should change, and how smooth the transitions between frames should be. One of our guidelines was that we did not want to change the scene too much, so we kept exploration around 30 and dynamism around 17 to 20. It takes about three minutes for the tool to generate the entire video, so we can decide whether it is good enough or whether we want to generate it again with different parameters.
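Genmo and RunDiffusion are web tools, so there is nothing to script for this step in the video itself; as a rough local analogue of the same image-to-video idea, here is a minimal sketch using the Stable Video Diffusion pipeline from the diffusers library (model ID, file names and parameter values are assumptions, and a large-VRAM GPU is required):

    import torch
    from diffusers import StableVideoDiffusionPipeline
    from diffusers.utils import load_image, export_to_video

    # Image-to-video pipeline: animates a single still frame.
    pipe = StableVideoDiffusionPipeline.from_pretrained(
        "stabilityai/stable-video-diffusion-img2vid-xt",
        torch_dtype=torch.float16,
        variant="fp16",
    ).to("cuda")

    # Animate one of the Midjourney scenes; motion_bucket_id roughly plays the role
    # of Genmo's "dynamism" slider (higher means more motion).
    scene = load_image("island_scene.png")
    frames = pipe(scene, decode_chunk_size=8, motion_bucket_id=40).frames[0]
    export_to_video(frames, "island_scene.mp4", fps=7)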
Now that we have our scenes, characters and voices, we go into D-ID to give it all life. We go to create video and add our new presenter; here you can do it directly with the background that you created earlier. Then you upload a voice audio file, or your own voice, and once that is done you simply generate your video. How brilliant and easy is that?
And lastly, a couple of very easy generators: Tweetgen, to create tweet feeds similar to or the same as mine; Star Wars generators, to create similar-looking text but with different fonts and music, because of the IP, of course; and then a face-cartoon app, or similar apps, to turn images of faces into cartoons. That last one is only needed in case Midjourney is not giving you the results you want. So let me show you our final result here, and if this was helpful: like, share, subscribe, and see you in the next one.



