OpenAI อัปเกรดเครื่องมือสร้างรูปภาพด้วย AI ผลลัพธ์สมจริงขึ้น ระบุรายละเอียดได้ดีกว่าเดิม

By: arjin

on 26 March 2025 - 07:37 Tags:

Topics:

OpenAI

Artificial Intelligence

Dall-E

LLM

OpenAI ประกาศอัปเกรดเครื่องมือสร้างรูปภาพขั้นสูงบนโมเดล GPT-4o ที่บอกว่าไม่เพียงแต่ได้รูปที่สวยงามกว่าเดิม แต่สามารถกำหนดรายละเอียดให้ตรงกับความต้องการยิ่งกว่าเดิม

เนื่องจาก GPT-4o เป็นโมเดลที่ค่อย ๆ คิดเป็นขั้นตอน ทำให้การสร้างรูปภาพบนโมเดลนี้จึงสามารถกำหนดรายละเอียด หรือสั่งแก้ไขเป็นส่วนได้ดีกว่า DALL·E ที่เป็นเครื่องมือสร้างรูปภาพตัวเดิม ตัวอย่างที่ OpenAI นำเสนอ เช่น สามารถระบุข้อความที่ปรากฎในรูปภาพอย่างละเอียดแต่ละตำแหน่งได้, สามารถกำหนดหรือแก้ไขภาพที่มีทั้งข้อความและคนในรูปได้, กำหนดรายละเอียดตามลำดับสูงถึง 10-20 รายการใน 1 prompt, สามารถเรียนรู้จากรูปที่อัปโหลดเข้าไปได้, มีความรู้จับคู่ข้อความกับภาพที่สามารถสร้าง Infographic ได้ เป็นต้น (ตัวอย่างที่น่าสนใจอยู่ท้ายข่าว)

OpenAI บอกว่าข้อมูลที่นำมาฝึกฝนเครื่องมือสร้างรูปภาพนี้ใช้ข้อมูลที่มีเผยแพร่แบบสาธารณะ รวมทั้งเป็นข้อมูลจากพาร์ตเนอร์เช่น Shutterstock

เครื่องมือสร้างรูปภาพใหม่บนโมเดล GPT-4o นี้ เริ่มอัปเดตให้ใช้งานตั้งแต่วันนี้สำหรับลูกค้า Plus, Pro, Team และลูกค้าฟรี ผ่าน ChatGPT โดยมีจำนวนจำกัดต่อวันสำหรับบางแผนเหมือนกับ DALL·E ส่วน Enterprise และ Edu จะตามมาในภายหลัง นอกจากนี้ยังสามารถเรียกใช้งานผ่าน Sora ได้ ส่วนคนที่ยังต้องการใช้ DALL·E เดิม ให้เรียกผ่านคัสตอม DALL·E GPT แทน สำหรับนักพัฒนาจะสามารถใช้งานผ่าน API ได้ในอีกไม่กี่สัปดาห์ข้างหน้า

ที่มา: OpenAI

No Description

ตัวอย่าง ภาพที่มีข้อความ ระบุรายละเอียดแต่ละคำ และมีคนในภาพ

magnetic poetry on a fridge in a mid century home:

Line 1: "A picture"
Line 2: "is worth"
Line 3: "a thousand words,"
Line 4: "but sometimes"Large gapLine 5: "in the right place"
Line 6: "can elevate"
Line 7: "its meaning.

"The man is holding the words "a few" in his right hand and "words" in his left.

No Description

ตัวอย่างภาพที่มีป้าย ระบุข้อความจำนวนมาก

Create a photorealistic image of two witches in their 20s (one ash balayage, one with long wavy auburn hair) reading a street sign.

Context:
a city street in a random street in Williamsburg, NY with a pole covered entirely by numerous detailed street signs (e.g., street sweeping hours, parking permits required, vehicle classifications, towing rules), including few ridiculous signs at the middle: (paraphrase it to make these legitimate street signs)"Broom Parking for Witches Not Permitted in Zone C" and "Magic Carpet Loading and Unloading Only (15-Minute Limit)" and "Reindeer Parking by Permit Only (Dec 24–25)\n Violators will be placed on Naughty List." The signpost is on the right of a street. Do not repeat signs. Signs must be realistic.

Characters:
one witch is holding a broom and the other has a rolled-up magic carpet. They are in the foreground, back slightly turned towards the camera and head slightly tilted as they scrutinize the signs.

Composition from background to foreground:
streets + parked cars + buildings -> street sign -> witches. Characters must be closest to the camera taking the shot

No Description

กรณีนี้เริ่มต้นจากการสร้างภาพแมว ใส่รายละเอียดต่าง ๆ จากนั้นให้แมวตัวนี้ตามรายละเอียดที่กำหนดไว้ก่อนหน้า ไปปรากฎในฉากวิดีโอเกมได้

No Description

อธิบายสถานการณ์ของภาพเป็นขั้นตอนเพื่อสร้างรูปขึ้นมาได้

We need evidence there is a currently present invisible elephant. Consider what an elephant is and does in the environment, then show us that, perhaps mid-process - but the elephant itself is not shown at all

No Description

สร้าง Infographic โดยระบุรายละเอียดที่ต้องการ

Make me a professionally shot photorealistic diagram of the top selling cocktails in my bar with recipes labeled on each drink.

put the recipes on handwritten cards in front of each drink.

the cards are brown, and the text is black.

background is white

Title is "4 most popular cocktails"

No Description

สร้างรูปภาพที่สมจริงกว่าเดิม

Generate a photorealistic image of farmer's market in toronto on a saturday in summer 2006, it's a beautiful late june day, people are shopping and eating sandwiches. in focus should be a young asian girl wearing denim overalls and sipping on a strawberry banana smoothie - rest can be blurred. the photo should be reminiscent of that a digital camera from 2006 would take, with a timestamp like a printed photo would have. aspect ratio should be 3:2

No Description

Hiring! บริษัทที่น่าสนใจ

LINE Company Thailand

LINE, the world's hottest mobile messaging platform, offers free text and voice messaging + Call

REFINITIV

The Financial and Risk business of Thomson Reuters is now Refinitiv

LTMH TECH

LTMH TECH มุ่งเน้นการพัฒนาผลิตภัณฑ์ที่สามารถช่วยพันธมิตรของเราให้บรรลุเป้าหมาย

Comments

By: JPorsh

on 26 March 2025 - 08:55 #1336815

ได้ในอีกไม่ "กี่"สัปดาห์ข้างหน้า
ตกคำว่ากี่ไปคำนึงนะครับ

By: mementototem

on 26 March 2025 - 09:28 #1336820

มีอะไรที่บอกได้ง่าย ๆ บ้างไหมครับว่า ภาพพวกนี้คือ Generative Image ไม่ใช่ภาพจริง, ปัญหาที่ผมเจออยู่ตอนนี้คือ ถ้ามันเหมือนจริงมาก ๆ ผมแทบไม่รู้เลยว่านี่ภาพจริงหรือเปล่า โดยเฉพาะใน field ที่ผมไม่ได้เชี่ยวชาญ

มันมีเครื่องมืออะไรที่แบบว่า เปิดรูปผ่าน browser แล้วคลิกปุ๊บรู้เลยว่าอันนี้มัน Generative Image หรืออันนี้ของจริง หรือถ้าไม่มีเครื่องมือใด ๆ การที่ใครใช้ Generative AI แล้วแปะป้ายสักหน่อยว่ารูปนี้ gen ขึ้นมานะ ไม่ใช่ของจริง มันคงจะดีมาก ๆ ช่วยในการตัดสินใจได้เยอะเลยว่า จะใช้รูปนั้นในการอ้างอิงข้อมูลได้หรือไม่

มันก็จริงที่ว่า เมื่อก่อนก็มีการปลอมแปลงรูป photoshop มีการแต่งเรื่อง แต่งข้อความเพื่อผลประโยชน์บางอย่างมานมนานแล้ว แต่ Generative AI ทำให้มันง่ายขึ้น ง่ายมาก ๆ แค่ไม่กี่วินาทีก็ได้ภาพ/ข้อมูลจำนวณมาก ที่ดูน่าเชื่อถือระดับหนึ่งแล้ว มันทำให้ไม่รู้ว่าควรจะเชื่ออะไรดีอีกต่อไป โดยเฉพาะใน field ที่ไม่เชี่ยวชาญ ข้อมูลค่อนข้างน้อยอยู่แล้ว หรือพิสูจน์ความถูกต้องได้ยาก

ถ้ามีทางแก้ หาทางรับมือกับมัน ซึ่งเป็นทางที่ง่าย สะดวก มันคงจะดี

Jusci - Google Plus - Twitter

By: IDCET

on 26 March 2025 - 15:55 #1336857 Reply to:1336820

ผมว่าตัว AI Generator น่าจะมีใส่ลายน้ำ, dot mark ขนาดเล็กในรูปภาพ กับ metadata ในไฟล์รูปภาพมาครับ

ตัวอย่างที่ใกล้เคียง ก็เครื่อง Printer ที่จะมี yellow dot ขนาดเล็ก ถูกพิมพ์ออกมาพร้อมกับเอกสารบนกระดาษ ใช้ในการตรวจสอบเอกสารปลอม กับการป้องกันการพิมพ์ธนบัตรปลอม

ความล้มเหลว คือจุดเริ่มต้นสู่ความหายนะ มีผลกระทบมากกว่าแค่เสียเงิน เวลา อนาคต และทรัพยากรที่เสียไป - จงอย่าล้มเหลว

By: Hoo

on 26 March 2025 - 23:03 #1336885 Reply to:1336820

ตอนนี้ เนียนขึ้นเรื่อยๆ
ตัวช่วยในปัจจุบันคือ ตัว gen ใส่ meta/ลายน้ำ ของภาพมาให้ว่า AI gen

แต่ถ้าวันนึงมีการ hack bypass การใส่ meta/ลายน้ำ
หรือลงทุนสร้างตัว gen ที่ไม่ใส่ meta/ลายน้ำได้ จะน่ากลัวมาก
โดยเฉพาะนำมา gen เพื่อหวังผลทางการเมือง

By: mementototem

on 27 March 2025 - 09:22 #1336901 Reply to:1336820

คือเข้าใจทั้ง 2 ท่านนะครับว่า มันจะมี metadata หรือ watermarks เมื่อ gen ออกมา แต่ผมก็ยังไม่ทราบว่าจะดูยังไงอยู่ดี ผมดู image info ผ่านฟังชั่นของ Firefox และ file properties ของ Windows ของภาพด้านบนในข่าวนี้ ก็ไม่มีส่วนไหนบอกเลยว่า นี่เป็นภาพที่ใช้ Generative AI หน่ะครับ

Jusci - Google Plus - Twitter

By: By_Myself

on 28 March 2025 - 14:31 #1337002 Reply to:1336901

มีปลักอินโครมอยู่ครับ สงสัยรูปไหนลากลงตรรวจได้เลย

/>o</

By: mementototem

on 29 March 2025 - 11:43 #1337067 Reply to:1337002

ตัวไหนเหรอครับ ผมเจอแต่ตัวที่ต้อง login ก่อน ซึ่งผมไม่ได้ลองนะครับ

ผมลองใช้ c2pa verify tool กับรูปที่ทาง blognone โพสต์ในข่าวนี้ ก็ไม่เจออะไรนะครับ ทั้งที่ถ้าผมจำไม่ผิด OpenAI บอกว่าจะทำให้ตรวจสอบด้วย c2pa ได้

Jusci - Google Plus - Twitter

Main menu