fangtongen's Collections

The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data, and Web Data Only • Paper • arXiv:2306.01116 • 41 upvotes
FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness • Paper • arXiv:2205.14135 • 15 upvotes
RoFormer: Enhanced Transformer with Rotary Position Embedding • Paper • arXiv:2104.09864 • 16 upvotes
Language Models are Few-Shot Learners • Paper • arXiv:2005.14165 • 17 upvotes
The Pile: An 800GB Dataset of Diverse Text for Language Modeling • Paper • arXiv:2101.00027 • 9 upvotes
Fast Transformer Decoding: One Write-Head is All You Need • Paper • arXiv:1911.02150 • 9 upvotes
Llama 2: Open Foundation and Fine-Tuned Chat Models • Paper • arXiv:2307.09288 • 246 upvotes
LLaMA: Open and Efficient Foundation Language Models • Paper • arXiv:2302.13971 • 18 upvotes
Orca: Progressive Learning from Complex Explanation Traces of GPT-4 • Paper • arXiv:2306.02707 • 47 upvotes
BGE M3-Embedding: Multi-Lingual, Multi-Functionality, Multi-Granularity Text Embeddings Through Self-Knowledge Distillation • Paper • arXiv:2402.03216 • 6 upvotes
Mistral 7B • Paper • arXiv:2310.06825 • 55 upvotes