arXiv - Yahoo Malaysia Search Results

Search results

[2112.10741] GLIDE: Towards Photorealistic Image Generation ... -...
arxiv.org › abs › 2112
Dec 20, 2021 · Diffusion models have recently been shown to generate high-quality synthetic images, especially when paired with a guidance technique to trade off diversity for fidelity. We explore diffusion models for the problem of text-conditional image synthesis and compare two different guidance strategies: CLIP guidance and classifier-free guidance. We find that the latter is preferred by human ...
arXiv.org e-Print archive
arxiv.org › search
Robust Speech Recognition via Large-Scale Weak Supervision
arxiv.org › abs › 2212
Dec 6, 2022 · We study the capabilities of speech processing systems trained simply to predict large amounts of transcripts of audio on the internet. When scaled to 680,000 hours of multilingual and multitask supervision, the resulting models generalize well to standard benchmarks and are often competitive with prior fully supervised results but in a zero-shot transfer setting without the need for any fine ...
[2101.11986] Tokens-to-Token ViT: Training Vision ... - arXiv.org
arxiv.org › abs › 2101
Jan 28, 2021 · Transformers, which are popular for language modeling, have been explored for solving vision tasks recently, e.g., the Vision Transformer (ViT) for image classification. The ViT model splits each image into a sequence of tokens with fixed length and then applies multiple Transformer layers to model their global relation for classification. However, ViT achieves inferior performance to CNNs ...
Computer Science > Computer Vision and Pattern Recognition - ...
arxiv.org › abs › 2311
Nov 28, 2023 · Character Animation aims to generate character videos from still images through driving signals. Currently, diffusion models have become the mainstream in visual generation research, owing to their robust generative capabilities. However, challenges persist in the realm of image-to-video, especially in character animation, where temporally maintaining consistency with detailed information ...
[2401.04070] Accelerating computational materials discovery with...
arxiv.org › abs › 2401
Jan 8, 2024 · High-throughput computational materials discovery has promised significant acceleration of the design and discovery of new materials for many years. Despite a surge in interest and activity, the constraints imposed by large-scale computational resources present a significant bottleneck. Furthermore, examples of large-scale computational discovery carried through experimental validation remain ...
Finding Articles - arXiv info
info.arxiv.org › help › find
All arXiv submissions are assigned a unique identifier of the form yymm.nnnnn (or arch-ive/yymmnnn for older submissions). To retrieve the abstract page of a paper, simply enter the identifier in the "Search or Article-id" box in the top right of most pages.
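
The two identifier forms described in this snippet can be told apart with a pair of regular expressions. The sketch below is illustrative only: the function name `parse_arxiv_id` and both patterns are assumptions read directly off the `yymm.nnnnn` and `arch-ive/yymmnnn` shapes quoted above, and they do not attempt to cover every variant arXiv accepts (for example, version suffixes).

```python
import re

# Assumed pattern for the newer form: four digits (yymm), a dot, five digits.
NEW_STYLE = re.compile(r"^(?P<yymm>\d{4})\.(?P<number>\d{5})$")

# Assumed pattern for the older form: archive name, slash, yymm, three digits.
OLD_STYLE = re.compile(
    r"^(?P<archive>[a-z]+(?:-[a-z]+)?)/(?P<yymm>\d{4})(?P<number>\d{3})$"
)


def parse_arxiv_id(identifier):
    """Return a dict describing the identifier, or None if neither form matches."""
    text = identifier.strip()

    match = NEW_STYLE.match(text)
    if match:
        return {
            "style": "new",
            "yymm": match.group("yymm"),
            "number": match.group("number"),
        }

    match = OLD_STYLE.match(text)
    if match:
        return {
            "style": "old",
            "archive": match.group("archive"),
            "yymm": match.group("yymm"),
            "number": match.group("number"),
        }

    return None


if __name__ == "__main__":
    # Identifiers taken from the results listed on this page.
    for candidate in ["2112.10741", "2101.11986", "2401.04070"]:
        print(candidate, parse_arxiv_id(candidate))
```

Identifiers that match neither pattern return `None`, so a caller can fall back to a plain keyword search instead of an identifier lookup.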

Searches related to arXiv

arXiv preprint arXiv
arXivid
arXiv preprint arXiv:1811.02629
