40 changes: 39 additions & 1 deletion BUILD.bazel
@@ -547,6 +547,7 @@ cc_library(
deps = [
":basics",
":configs",
":flash_structs",
":gemma_args",
":kv_cache",
":mat",
@@ -594,6 +595,11 @@ cc_test(

INTERNAL_DEPS = []

cc_library(
name = "flash_structs",
hdrs = ["gemma/flash_structs.h"],
)

cc_library(
name = "attention",
srcs = [
@@ -603,7 +609,6 @@ cc_library(
hdrs = [
"gemma/attention.h",
"gemma/flash_attention.h",
"gemma/flash_structs.h",
],
textual_hdrs = [
"gemma/gemma-inl.h",
@@ -612,6 +617,7 @@ cc_library(
":activations",
":basics",
":configs",
":flash_structs",
":kv_cache",
":mat",
":matmul",
@@ -822,6 +828,38 @@ cc_test(
],
)

cc_test(
name = "wheat_from_chaff_test",
srcs = ["evals/wheat_from_chaff_test.cc"],
data = [
"evals/testdata/google/big_bang_theory.txt",
"evals/testdata/google/black_hole.txt",
"evals/testdata/google/general_relativity.txt",
"evals/testdata/google/qed.txt",
"evals/testdata/holiday_story.txt",
"evals/testdata/quark_1.txt",
"evals/testdata/quark_2.txt",
"evals/testdata/special_relativity.txt",
"evals/testdata/standard_model.txt",
],
linkstatic = True,
# Requires model files
tags = [
"local",
"manual",
"no_tap",
],
deps = [
":benchmark_helper",
":configs",
":gemma_lib",
"@googletest//:gtest_main", # buildcleaner: keep
"//io",
"@highway//:abort_header_only",
"@highway//:hwy_test_util",
],
)

cc_binary(
name = "gemma",
srcs = ["gemma/run.cc"],
6 changes: 5 additions & 1 deletion evals/benchmark_helper.cc
@@ -150,7 +150,11 @@ std::vector<QueryResult> GemmaEnv::BatchQueryModel(

QueryResult GemmaEnv::QueryModel(const std::string& input) {
const std::vector<int> prompt = WrapAndTokenize(input);
return QueryModel(prompt);
auto result = QueryModel(prompt);
fprintf(stderr, "prompt size: %zu, response size: %zu, total tokens: %zu\n",
prompt.size(), result.tokens_generated - prompt.size(),
result.tokens_generated);
return result;
}

QueryResultAndMetrics GemmaEnv::BatchQueryModelWithMetrics(
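Review note: the new log line in `QueryModel` derives the response size by subtracting two `size_t` values. Below is a minimal sketch of that arithmetic, assuming `tokens_generated` counts prompt and response tokens together (the "total tokens" label suggests this, but the diff does not confirm it). `ResponseSize` and `LogTokenCounts` are hypothetical helpers for illustration, not part of this PR.

```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical helper (not in the PR): response length given the total token
// count and the prompt length. Assumes the total includes the prompt tokens.
// std::size_t is unsigned, so a plain `total - prompt` would wrap to a huge
// value if the total were ever smaller than the prompt; clamp that case to 0.
inline std::size_t ResponseSize(std::size_t total_tokens,
                                std::size_t prompt_size) {
  return total_tokens >= prompt_size ? total_tokens - prompt_size : 0;
}

// Mirrors the format string added to GemmaEnv::QueryModel in this diff.
inline void LogTokenCounts(std::size_t total_tokens, std::size_t prompt_size) {
  std::fprintf(stderr,
               "prompt size: %zu, response size: %zu, total tokens: %zu\n",
               prompt_size, ResponseSize(total_tokens, prompt_size),
               total_tokens);
}
```

If the counter turned out to track only generated tokens, the subtraction in the diff would underflow for long prompts. The clamp keeps the log readable in that case, though it would hide the miscount rather than fix it.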
2 changes: 2 additions & 0 deletions evals/benchmark_helper.h
@@ -62,6 +62,8 @@ class GemmaEnv {
static_cast<size_t>(max_generated_tokens);
}

void PrintProfileResults() { ctx_.profiler.PrintResults(); }

std::vector<int> Tokenize(const std::string& input) const {
std::vector<int> tokens;
HWY_ASSERT(gemma_.Tokenizer().Encode(input, &tokens));
24 changes: 14 additions & 10 deletions evals/gemma_batch_bench.cc
@@ -37,7 +37,8 @@ GemmaEnv* s_env = nullptr;
class GemmaBatchBench : public ::testing::Test {
protected:
std::vector<std::string> BatchGemmaReply(
const std::vector<std::string>& inputs) {
const std::vector<std::string>& inputs, AttentionImpl attention_impl) {
s_env->MutableConfig().attention_impl = attention_impl;
s_env->MutableConfig().temperature = 0.0f; // deterministic
s_env->MutableConfig().verbosity = 2;
std::vector<std::string> replies;
@@ -128,16 +129,19 @@ std::vector<std::string> GenerateInputs() {
TEST_F(GemmaBatchBench, RandomQuestionsBatched) {
s_env->SetMaxGeneratedTokens(12);
const std::vector<std::string> inputs = GenerateInputs();

// Run multiple times so that auto-tuning is closer to complete.
for (size_t rep = 0; rep < 4; ++rep) {
std::vector<std::string> responses = BatchGemmaReply(inputs);
for (size_t i = 0; i < HWY_MIN(hwy::Unpredictable1() * 3, responses.size());
++i) {
fprintf(stderr, "Rep %zu batch answer %zu '%s'\n\n", rep, i,
responses[i].c_str());
const AttentionImpl modes[] = {AttentionImpl::kOld, AttentionImpl::kFlash};
for (const AttentionImpl mode : modes) {
// Run multiple times so that auto-tuning is closer to complete.
fprintf(stderr, "Testing mode %s\n", GetAttentionImplName(mode).c_str());
for (size_t rep = 0; rep < 4; ++rep) {
std::vector<std::string> responses = BatchGemmaReply(inputs, mode);
for (size_t i = 0;
i < HWY_MIN(hwy::Unpredictable1() * 3, responses.size()); ++i) {
fprintf(stderr, "Rep %zu batch answer %zu '%s'\n\n", rep, i,
responses[i].c_str());
}
PROFILER_PRINT_RESULTS();
}
PROFILER_PRINT_RESULTS();
}
}
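Review note: after this change the benchmark nests the four repetitions inside a loop over the two attention implementations, so all `kOld` repetitions run before any `kFlash` repetition. The sketch below reproduces only that iteration order. The enum is a stand-in for the real `AttentionImpl`, and `ModeName` and `RunOrder` are hypothetical. The strings `ModeName` returns are made up and may not match what `GetAttentionImplName` returns.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Stand-in for the real AttentionImpl enum; only the two values exercised by
// the benchmark are modeled.
enum class AttentionImpl { kOld, kFlash };

// Hypothetical substitute for GetAttentionImplName; the strings are invented.
inline std::string ModeName(AttentionImpl mode) {
  return mode == AttentionImpl::kOld ? "old" : "flash";
}

// Returns one label per (mode, rep) pair, in the order the nested loops in
// the updated test visit them: outer loop over modes, inner loop over reps.
inline std::vector<std::string> RunOrder(std::size_t reps) {
  const AttentionImpl modes[] = {AttentionImpl::kOld, AttentionImpl::kFlash};
  std::vector<std::string> order;
  for (const AttentionImpl mode : modes) {
    for (std::size_t rep = 0; rep < reps; ++rep) {
      order.push_back(ModeName(mode) + " rep " + std::to_string(rep));
    }
  }
  return order;
}
```

With `reps = 4` this yields eight runs. Because the repetition counter restarts for each mode, the "Rep %zu" values in the log are only unambiguous alongside the preceding "Testing mode" line.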

10 changes: 10 additions & 0 deletions evals/testdata/holiday_story.txt
@@ -0,0 +1,10 @@
Albert and Marcia were on holiday. Their parents had brought them to the beach.
Albert was generally unimpressed with beaches, as he would rather explore a dark forest and see the variety of mosses and fungi that grow in the damp conditions.
On the other hand, Marcia loved to build enormous sand castles.
Albert enjoyed collecting limpet shells to decorate the outer walls of the turrets, which he secretly thought made them look like daleks.
Whilst digging sand for building, Marcia always liked to dig deep, to see if she could get to water coming through the sand from the sea.
When the castle was nearly complete, and Marcia needed more sand, she hit a large piece of rusty metal.
Curious as to what it was, Marcia kept digging to try to expose all of it, but it was very big and hard to get at as it was so deep in the sand.
Excited by the prospect of finding something unusual in the sand, Albert joined in to help dig out the entire object.
Almost an hour later, they had exposed most of a ship’s anchor.
During the excavation a crowd of onlookers had formed around them, who then proceeded to take selfies in front of the unusual piece of beach litter.
21 changes: 21 additions & 0 deletions evals/testdata/quark_1.txt
@@ -0,0 +1,21 @@
Text from https://en.wikipedia.org/wiki/Quark is licensed under Creative Commons Attribution-ShareAlike 4.0 License; (https://en.wikipedia.org/wiki/Wikipedia:Text_of_the_Creative_Commons_Attribution-ShareAlike_4.0_International_License)

Quark
From Wikipedia, the free encyclopedia
(Redirected from Quarks)
This article is about the elementary particle and its antiparticle. For other uses, see Quark (disambiguation).
A quark (/ˈkwɔːrk, ˈkwɑːrk/) is a type of elementary particle and a fundamental constituent of matter. Quarks combine to form composite particles called hadrons, the most stable of which are protons and neutrons, the components of atomic nuclei.[1] All commonly observable matter is composed of up quarks, down quarks and electrons. Owing to a phenomenon known as color confinement, quarks are never found in isolation; they can be found only within hadrons, which include baryons (such as protons and neutrons) and mesons, or in quark–gluon plasmas.[2][3][nb 1] For this reason, much of what is known about quarks has been drawn from observations of hadrons.

Quarks have various intrinsic properties, including electric charge, mass, color charge, and spin. They are the only elementary particles in the Standard Model of particle physics to experience all four fundamental interactions, also known as fundamental forces (electromagnetism, gravitation, strong interaction, and weak interaction), as well as the only known particles whose electric charges are not integer multiples of the elementary charge.

There are six types, known as flavors, of quarks: up, down, charm, strange, top, and bottom.[4] Up and down quarks have the lowest masses of all quarks. The heavier quarks rapidly change into up and down quarks through a process of particle decay: the transformation from a higher mass state to a lower mass state. Because of this, up and down quarks are generally stable and the most common in the universe, whereas strange, charm, bottom, and top quarks can only be produced in high energy collisions (such as those involving cosmic rays and in particle accelerators). For every quark flavor there is a corresponding type of antiparticle, known as an antiquark, that differs from the quark only in that some of its properties (such as the electric charge) have equal magnitude but opposite sign.

The quark model was independently proposed by physicists Murray Gell-Mann and George Zweig in 1964.[5] Quarks were introduced as parts of an ordering scheme for hadrons, and there was little evidence for their physical existence until deep inelastic scattering experiments at the Stanford Linear Accelerator Center in 1968.[6][7] Accelerator program experiments have provided evidence for all six flavors. The top quark, first observed at Fermilab in 1995, was the last to be discovered.[5]

Classification
See also: Standard Model
A four-by-four table of particles. Columns are three generations of matter (fermions) and one of forces (bosons). In the first three columns, two rows contain quarks and two leptons. The top two rows' columns contain up (u) and down (d) quarks, charm (c) and strange (s) quarks, top (t) and bottom (b) quarks, and photon (γ) and gluon (g), respectively. The bottom two rows' columns contain electron neutrino (ν sub e) and electron (e), muon neutrino (ν sub μ) and muon (μ), and tau neutrino (ν sub τ) and tau (τ), and Z sup 0 and W sup ± weak force. Mass, charge, and spin are listed for each particle.
Six of the particles in the Standard Model are quarks (shown in purple). Each of the first three columns forms a generation of matter.
The Standard Model is the theoretical framework describing all the known elementary particles. This model contains six flavors of quarks (q), named up (u), down (d), strange (s), charm (c), bottom (b), and top (t).[4] Antiparticles of quarks are called antiquarks, and are denoted by a bar over the symbol for the corresponding quark, such as ū for an up antiquark. As with antimatter in general, antiquarks have the same mass, mean lifetime, and spin as their respective quarks, but the electric charge and other charges have the opposite sign.[8]
