★ MistralAI 7B (new foundation model) - AI Information Channel


Upvotes 1 · Downvotes 0 · Comments 0 · Views 537 · Posted 2023-09-27 15:41:30 · Edited 2023-09-28 11:21:11

https://arca.live/b/ai101/87351532

MistralAI-0.1-7B

A new foundation language model just released by Mistral AI.

https://mistral.ai/news/announcing-mistral-7b/

* 7.3B parameters

* Outperforms Llama 2 13B on all benchmarks

* Outperforms Llama 1 34B on many benchmarks

* Approaches CodeLlama 7B in coding performance while retaining strong language ability

* Apache 2.0 license (commercial use permitted)

* The development team includes authors of the original LLaMA paper (Timothee Lacroix, Guillaume Lample, Marie-Anne Lachaux).

* Uses Grouped-query attention (GQA) for faster inference

* Uses Sliding Window Attention (SWA) to handle longer sequences at lower cost
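To make the last two bullets concrete, here is a minimal plain-Python sketch of the general ideas behind SWA and GQA. It is not taken from Mistral's code: the function names and the toy sizes are hypothetical, chosen only for illustration. The first function builds a causal mask in which each position sees only the most recent `window` tokens; the second maps each query head to the key/value head it shares with its group.

```python
def sliding_window_causal_mask(seq_len, window):
    """Return mask[i][j] == True when query position i may attend to key j.

    Illustrative assumption: a token attends to itself and to at most
    (window - 1) tokens immediately before it, never to future tokens.
    """
    return [
        [(j <= i) and (i - j < window) for j in range(seq_len)]
        for i in range(seq_len)
    ]


def kv_head_for_query_head(q_head, n_q_heads, n_kv_heads):
    """Return the index of the key/value head shared by query head q_head.

    In grouped-query attention, consecutive query heads form groups and
    each group reuses one key/value head, which shrinks the KV cache.
    """
    if n_q_heads % n_kv_heads != 0:
        raise ValueError("n_q_heads must be a multiple of n_kv_heads")
    group_size = n_q_heads // n_kv_heads
    return q_head // group_size


# Toy sizes (hypothetical, not the model's real configuration).
mask = sliding_window_causal_mask(seq_len=5, window=3)
head_map = [kv_head_for_query_head(h, n_q_heads=8, n_kv_heads=2) for h in range(8)]
```

With these toy sizes, each row of `mask` has at most three `True` entries, so the attention cost per token is bounded by the window rather than by the full sequence length, and the eight query heads share only two key/value heads.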

Related Reddit threads

https://www.reddit.com/r/LocalLLaMA/comments/16tf4qn/mistralai017b_the_first_release_from_mistral/

https://www.reddit.com/r/LocalLLaMA/comments/16tnrpm/mistral_7b_releases_with_claims_of_outperforming/

https://www.reddit.com/r/LocalLLaMA/comments/16twtfn/llm_chatrp_comparisontest_mistral_7b_base_instruct/

Download (original)

https://huggingface.co/mistralai

https://huggingface.co/mistralai/Mistral-7B-v0.1 (base model)

https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1 (instruction-tuned model)

Download (converted)

https://huggingface.co/TheBloke/Mistral-7B-v0.1-GGUF

https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.1-GGUF

https://huggingface.co/TheBloke/Mistral-7B-v0.1-AWQ

https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.1-AWQ

https://huggingface.co/bn22/Mistral-7B-v0.1-sharded

https://huggingface.co/bn22/Mistral-7B-Instruct-v0.1-sharded

https://huggingface.co/kittn/mistral-7B-v0.1-hf

GitHub

https://github.com/mistralai/mistral-src

Benchmark results

https://twitter.com/i/status/1707100769277595951

Benchmarks can't be taken at face value, but this is welcome news all the same.
