Paper notes: papers I have read

2023/11/25

A Method for Animating Children's Drawings of the Human Figure https://3dvar.com/Smith2023A.pdf

Animates children's drawings of human figures.
For the object detection and pose estimation tasks, models trained on real humans are fine-tuned on drawings.

Interactive Sketching of Mannequin Poses https://arxiv.org/pdf/2212.07098.pdf

Predicts mannequin poses.
The user then refines the predicted pose.

Interactive pose and shape editing with simple sketches from different viewing angles https://pdf.sciencedirectassets.com/271576/1-s2.0-S0097

Predicts pose and shape from simple, easy-to-draw sketches of the front and side views.

Data generation process
Decoding the body shape

RaBit: Parametric Modeling of 3D Biped Cartoon Characters with a Topological-consistent Dataset https://gaplab.cuhk.edu.cn/projects/RaBit/ https://arxiv.org/pdf/2303.12564.pdf

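From 1,500 topologically consistent 3D models of biped cartoon characters, builds a parametric deformable model and a GAN-based texture generator.
Landmarks are also used to compute joint positions.
When artists model a character by deforming the template, they need to preserve the relationships between joint positions.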
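
A minimal sketch of the two ideas above: a parametric shape model over meshes that share one topology, and joint positions derived from landmark vertices. It assumes a linear shape basis (in the style of SMPL-like blend shapes) and assumes a joint is the mean of its landmark vertices. The names `shape_vertices`, `joint_from_landmarks`, `MEAN`, `BASIS`, and `TOP_LANDMARKS` are hypothetical, and the data is a toy, not RaBit's.

```python
from typing import List, Sequence

Vec3 = List[float]


def shape_vertices(mean: Sequence[Vec3],
                   basis: Sequence[Sequence[Vec3]],
                   betas: Sequence[float]) -> List[Vec3]:
    """Linear shape model: vertices = mean + sum_k betas[k] * basis[k].

    Works only because every mesh shares the same topology, so vertex i
    means the same body location in every character.
    """
    out = [list(v) for v in mean]
    for beta, component in zip(betas, basis):
        for i, offset in enumerate(component):
            for axis in range(3):
                out[i][axis] += beta * offset[axis]
    return out


def joint_from_landmarks(vertices: Sequence[Vec3],
                         landmark_ids: Sequence[int]) -> Vec3:
    """Assumed rule: a joint sits at the mean of its landmark vertices."""
    n = len(landmark_ids)
    return [sum(vertices[i][axis] for i in landmark_ids) / n
            for axis in range(3)]


# Toy template with 4 vertices (hypothetical data).
MEAN = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 1.0, 0.0]]
# One shape component that stretches the top two vertices along y.
BASIS = [[[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 1.0, 0.0]]]
TOP_LANDMARKS = [2, 3]

verts = shape_vertices(MEAN, BASIS, [0.5])
joint = joint_from_landmarks(verts, TOP_LANDMARKS)
print(joint)
```

Because the joint is recomputed from the deformed landmarks, it follows the shape change automatically, which is one way to keep joint relationships consistent while the template is deformed.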

Few-shot learning

Few-shot Learning: Learning from a Small Amount of Image Data [Part 1] | One Tech Blog https://tech.gmogshd.com/few-shot-learning/

Interactive machine learning (IMT)

Examining the Potential Use of Generative AI in the Teaching Process of Interactive Machine Learning https://iis-lab.org/wp-content/uploads/2023/06/DICOMO2023_Yamamoto.pdf

With recent advances in machine learning, more people use machine learning, but few can build machine learning models themselves.

Human-in-the-Loop Machine Learning in the Era of Large Language Models - Speaker Deck https://speakerdeck.com/yukinobaba/human-in-the-loop-ml-llm?slide=6

Current State and Future Prospects of Active Learning for Image Data ~With a Side of the Latest Unsupervised Learning~ - ABEJA Tech Blog https://tech-blog.abeja.asia/entry/active-learning-20200503

FiG-NeRF: Figure-Ground Neural Radiance Fields for 3D Object Category Modelling https://arxiv.org/pdf/2104.08418.pdf

https://www.docswell.com/s/8590143908/5RXM61-2023-10-25-145247#p2

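From a variety of photos of one object category taken against a consistent background scene, builds a model of that category and segments the objects. Uses NeRF.
The Deformation Field, which absorbs differences in object shape, might be useful.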
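
A toy sketch of what "absorbing shape differences" with a deformation field can mean: each instance is described by warping its points into one shared template, so the template itself never changes. This is a hand-written stand-in, not FiG-NeRF's learned network. The functions `template_sdf`, `deform`, and `instance_field`, and the per-axis scale used as a shape code, are all assumptions made for illustration.

```python
import math
from typing import List, Sequence


def template_sdf(p: Sequence[float]) -> float:
    """Shared template: signed distance to a unit sphere."""
    return math.sqrt(sum(c * c for c in p)) - 1.0


def deform(p: Sequence[float], shape_code: Sequence[float]) -> List[float]:
    """Toy deformation field: warp an instance-space point into template space.

    Here the shape code is just a per-axis scale of the instance. In a learned
    model this would be a network conditioned on a latent code.
    """
    return [c / s for c, s in zip(p, shape_code)]


def instance_field(p: Sequence[float], shape_code: Sequence[float]) -> float:
    """Implicit field of one instance; its zero level set is the surface."""
    return template_sdf(deform(p, shape_code))


# Hypothetical instance: the template stretched to twice its height along y.
tall = [1.0, 2.0, 1.0]
on_surface = instance_field([0.0, 2.0, 0.0], tall)  # top of the tall object
inside = instance_field([0.0, 1.0, 0.0], tall)      # halfway up, inside it
print(on_surface, inside)
```

Only the shape code differs between instances, so anything attached to the template (for example a segmentation label) carries over to every instance through the warp.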

RAC: Reconstructing Animatable Categories from Videos https://gengshan-y.github.io/rac-www/index.html

[DL Reading Group] BANMo: Building Animatable 3D Neural Models from Many Casual Videos | ドクセル https://www.docswell.com/s/DeepLearning2023/Z6YLN4-dlbanmo-building-animatable-3d-neural-models-from-many-casual-videos#p13

Introduction to the LoRA paper https://zenn.dev/schwalbe10/articles/low-rank-adaptation

[Seminar material] LoRA: Low-Rank Adaptation of Large Language Models - Speaker Deck https://speakerdeck.com/hpprc/lun-jiang-zi-liao-lora-low-rank-adaptation-of-large-language-models?slide=29

[Stable Diffusion] Theory of additional training | henatips https://henatips.com/page/49/

SegGPT: Segmenting Everything In Context https://arxiv.org/pdf/2304.03284.pdf https://encord.com/blog/seggpt-segment-everything-in-context-explainer/#h5

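A generalist vision model that performs segmentation in context from example image and mask pairs (a vision model, not a language model, despite the name).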

MiniGPT-V

Multimodal LLM

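Reading images with advanced capabilities: turning MiniGPT-V into an API and using it from programs (1) | めぐチャンネル https://note.com/ai_meg/n/na1b61f3dbfc4
Aiming for practical use: adding a multi-session API to MiniGPT-v2, the latest image-reading AI | めぐチャンネル https://note.com/ai_meg/n/nc84cee7c58b2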

Successful cross-domain GAN training with just 10 images | AI-SCHOLAR | AI (artificial intelligence) paper and technical information media https://ai-scholar.tech/articles/gan/Few-shot_cross