In selected columns in R, replace all NA with FALSE
If you want to perform the replacement on a subset of variables, you can still use the is.na(*) <- trick as follows:
df[c("x1", "x2")][is.na(df[c("x1", "x2")])] <- FALSE
In my opinion, using temporary variables makes the logic easier:
vars.to.replace <- c("x1", "x2")
df2 <- df[vars.to.replace]
df2[is.na(df2)] <- FALSE
df[vars.to.replace] <- df2
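A minimal, self-contained sketch of the approach above. The data frame and its extra column x3 are made up for illustration and are not from the question:

```r
# Hypothetical example data; only x1 and x2 should be modified.
df <- data.frame(
  x1 = c(TRUE, NA, FALSE),
  x2 = c(NA, NA, TRUE),
  x3 = c(NA, 1, 2)
)

vars.to.replace <- c("x1", "x2")

# is.na() on a data frame returns a logical matrix, which can be used
# to index the same columns for assignment.
df[vars.to.replace][is.na(df[vars.to.replace])] <- FALSE

print(df)
```

The NA in x3 is left alone because x3 is not listed in vars.to.replace.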
How to replace NA values in a table for selected columns
You can do:
x[, 1:2][is.na(x[, 1:2])] <- 0
or better (IMHO) use the variable names:
x[c("a", "b")][is.na(x[c("a", "b")])] <- 0
In both cases, 1:2 or c("a", "b") can be replaced by a predefined vector.
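As a rough illustration of using a predefined vector, the sketch below applies the same replacement once by name and once by position. The data frame x and the names cols and idx are assumptions made for this example:

```r
# Hypothetical data; column c is deliberately left out of the selection.
x <- data.frame(a = c(1, NA, 3), b = c(NA, 5, 6), c = c(NA, 8, 9))

cols <- c("a", "b")  # predefined vector of column names
idx  <- 1:2          # predefined vector of column positions

by_name <- x
by_name[cols][is.na(by_name[cols])] <- 0

by_pos <- x
by_pos[, idx][is.na(by_pos[, idx])] <- 0

print(by_name)
print(by_pos)
```

Selecting by name is usually more robust, since the code keeps working if the column order changes.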
Replace only some NA values for selected rows and only for one column in R
df$type[!df$asked & is.na(df$type)] <- "Answer"
gives you the desired table:
> type <-
+   c(NA, rep("Question",3), NA, NA, rep("Answer",4), rep(NA,3), rep("Answer",2),
+     NA, "Question", NA, rep("Answer",2), NA, NA)
> asked <- c(
+   T, rep(F, 9), T, rep(F, 4), T, rep(F, 4), T, F
+ )
> df <- data.frame(title=1:22,comments=1:22,type,asked)
> df$type[!df$asked & is.na(df$type)] <- "Answer"
> df
   title comments     type asked
1      1        1     <NA>  TRUE
2      2        2 Question FALSE
3      3        3 Question FALSE
4      4        4 Question FALSE
5      5        5   Answer FALSE
6      6        6   Answer FALSE
7      7        7   Answer FALSE
8      8        8   Answer FALSE
9      9        9   Answer FALSE
10    10       10   Answer FALSE
11    11       11     <NA>  TRUE
12    12       12   Answer FALSE
13    13       13   Answer FALSE
14    14       14   Answer FALSE
15    15       15   Answer FALSE
16    16       16     <NA>  TRUE
17    17       17 Question FALSE
18    18       18   Answer FALSE
19    19       19   Answer FALSE
20    20       20   Answer FALSE
21    21       21     <NA>  TRUE
22    22       22   Answer FALSE
R Replace NA for all columns except *
You can use mutate_at:
library(dplyr)
Exclude them by name
df %>% mutate_at(vars(-c(Date, thatCol)), ~replace(., is.na(.), 0))
Exclude them by position
df %>% mutate_at(-c(1,4), ~replace(., is.na(.), 0))
Select them by name
df %>% mutate_at(vars(col1, thisCol, col999), ~replace(., is.na(.), 0))
Select them by position
df %>% mutate_at(c(2, 3, 5), ~replace(., is.na(.), 0))
If you want to use replace_na:
df %>% mutate_at(vars(-c(Date, thatCol)), tidyr::replace_na, 0)
Note that mutate_at is superseded by across as of dplyr 1.0.0.
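For readers on dplyr 1.0.0 or later, the "exclude by name" case can be sketched with across(). The example data below is made up; only the column names mirror those used above:

```r
library(dplyr)

# Hypothetical data for illustration.
df <- data.frame(
  Date    = as.Date(c("2020-01-01", NA, "2020-01-03")),
  col1    = c(1, NA, 3),
  thisCol = c(NA, 2, NA),
  thatCol = c("a", NA, "c")
)

# Exclude Date and thatCol; replace NA with 0 in every other column.
res <- df %>%
  mutate(across(-c(Date, thatCol), ~ replace(.x, is.na(.x), 0)))

print(res)
```

The selection inside across() accepts the same helpers as vars(), so names, positions and negations all carry over.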
How to represent NA as false in R
I think the most helpful approach is to use dplyr's case_when function and explicitly declare how the NA cases you mentioned should be handled.
Replicating your example (note that I'm explicitly defining NAs here; your NAs were the result of R failing to coerce a string ("NA") in a numeric vector):
col1 = as.numeric(c(10, 2, 15, 2, NA_real_, 15))
col2 = as.numeric(c(15, 15, 2, 2, 15, NA_real_))
test <- data.frame(col1, col2)
For both the mutate function and the case_when function I load dplyr. If you are not familiar with case_when, it's like an ifelse with multiple conditions. Each condition is followed by a tilde "~". What comes after the tilde is what gets assigned if the condition is true. To set "everything else" to some value x, write TRUE ~ "x", since this evaluates to true for all cases not matched by the conditions above.
This should do what you want:
library(dplyr)
test <- mutate(.data = test,
               G5 = case_when(col1 > 5 & col2 > 5 ~ "Yes", # Original
                              (is.na(col1) & col2 > 5) | (col1 > 5 & is.na(col2)) ~ "Yes",
                              TRUE ~ "No")) # Everything else is set to "No".
test
#>   col1 col2  G5
#> 1   10   15 Yes
#> 2    2   15  No
#> 3   15    2  No
#> 4    2    2  No
#> 5   NA   15 Yes
#> 6   15   NA Yes
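For the narrower task of simply treating NA as FALSE in a logical condition (a different rule from the case_when logic above), here is a short sketch using base R and dplyr::coalesce(). The variable names are illustrative:

```r
library(dplyr)

col1 <- c(10, 2, 15, 2, NA, 15)

# Comparisons involving NA yield NA, not FALSE.
cond <- col1 > 5

# Base R: keep TRUE only where the condition is known to be TRUE.
cond_base <- !is.na(cond) & cond

# dplyr: coalesce() fills each NA with the first non-missing alternative.
cond_dplyr <- coalesce(cond, FALSE)

print(cond_base)
print(cond_dplyr)
```

Both forms give a logical vector without NA, so it can be used directly in ifelse() or in row subsetting.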
Replace NA with an interpolated value for specific column fields in r
This is specified in ?na.approx:
An object of similar structure as object with NAs replaced by interpolation. For na.approx only the internal NAs are replaced and leading or trailing NAs are omitted if na.rm = TRUE or not replaced if na.rm = FALSE.
By default, the usage is na.rm = TRUE:
na.approx(object, x = index(object), xout, ..., na.rm = TRUE, maxgap = Inf, along)
So we can change the code to
mis_datos[, 42] <- na.approx(mis_datos[, 42], na.rm = FALSE)
In a large dataset it is possible to have leading/trailing NAs, and the OP's code then returns a vector with fewer elements because na.rm = TRUE, which triggers the length-difference error on assignment.
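The length difference can be seen on a small made-up vector with a leading, an internal and a trailing NA (a sketch; the vector x is not from the question):

```r
library(zoo)

x <- c(NA, 1, NA, 3, NA)

# Default na.rm = TRUE: leading and trailing NAs are dropped.
short <- na.approx(x)

# na.rm = FALSE: leading and trailing NAs are kept, so the length matches x.
full <- na.approx(x, na.rm = FALSE)

print(length(short))
print(length(full))
```

Only the second form can be assigned back into a data frame column of the original length.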
Replace selected column values based on another dataframe with a different size
Data:
dfa <- read.table(text="Accession Column1 Column2 Column3 Root ID
2000_1 0 0.2 14 2000 1
2000_2 0.01 0.2 17 2000 2
2001_1 0.012 0.22 11 2001 1
2001_2 0.011 0.231 17 2001 2", header = TRUE)
Libraries and Functions:
library(tidyverse) # provides dplyr and tidyr
cv <- function(x) 100 * (sd(x) / mean(x))
Solution:
Basically, if we get straight to the point and look at the end result, you want to replace the values in Column1:Column3 with NA if the CV is greater than 30. Otherwise you want to keep the original values. The following code does this.
dfa %>%
  group_by(Root) %>%
  mutate_at(vars(Column1:Column3),
            list(~ if (cv(.) > 30) NA else .))
Result:
#> # A tibble: 4 x 6
#>   Accession Column1 Column2 Column3  Root    ID
#>   <fct>       <dbl>   <dbl>   <dbl> <int> <int>
#> 1 2000_1     NA       0.2        14  2000     1
#> 2 2000_2     NA       0.2        17  2000     2
#> 3 2001_1      0.012   0.22       NA  2001     1
#> 4 2001_2      0.011   0.231      NA  2001     2
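The same group-wise rule can be sketched with across(), which supersedes mutate_at in dplyr 1.0.0 and later. This is an alternative formulation, not the code from the answer. The data frame is rebuilt inline so the block runs on its own, and the replacement is written as an explicit NA vector of the group's length:

```r
library(dplyr)

# Coefficient of variation in percent.
cv <- function(x) 100 * (sd(x) / mean(x))

dfa <- data.frame(
  Accession = c("2000_1", "2000_2", "2001_1", "2001_2"),
  Column1   = c(0, 0.01, 0.012, 0.011),
  Column2   = c(0.2, 0.2, 0.22, 0.231),
  Column3   = c(14, 17, 11, 17),
  Root      = c(2000L, 2000L, 2001L, 2001L),
  ID        = c(1L, 2L, 1L, 2L)
)

# Within each Root group, blank out a column when its CV exceeds 30.
res <- dfa %>%
  group_by(Root) %>%
  mutate(across(Column1:Column3,
                ~ if (cv(.x) > 30) rep(NA_real_, length(.x)) else .x)) %>%
  ungroup()

print(res)
```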
More complicated approaches:
If we follow your train of thought, we end up with more complicated code, shown below:
dfa %>%
  select_if(function(col) is.numeric(col) & all(col != .$ID)) %>%
  group_by(Root) %>%
  summarise_each(list(cv)) %>%
  mutate_at(vars(Column1:Column3),
            list(~ ifelse(. > 30, NA, 0))) %>%
  left_join(dfa[,c("Root", "ID")], . , by = "Root") %>%
  bind_rows(dfa, .) %>%
  group_by(Root, ID) %>%
  summarise_each(list(~ if(is.numeric(.)) sum(., na.rm = FALSE) else first(.))) %>%
  ungroup %>%
  select(-ID, -Root, everything())
Explanation:
- Selecting the numeric columns except ID.
- Grouping by Root.
- Calculating the CV for all columns.
- Replacing CV values greater than 30 with NA and the rest with 0. I plan to sum these with the original values, since it seems the OP wants to keep the NAs (i.e. CV greater than 30) from this CV matrix but leave the other values in the original dataset unchanged. Adding 0 leaves the latter unchanged, while adding NA (with na.rm = F) turns the value into NA.
- Adding the ID column again so that the CV array has the same size (row-wise) as the original data. It is also used later for grouping.
- Binding the datasets row-wise.
- Grouping by Root and ID.
- Summarizing the numeric columns (i.e. Column1, Column2, etc.) by adding the values of the original data frame and the modified CV matrix, and keeping the first value of the other columns (since the original data frame comes first in bind_rows, this means the original values are preserved).
- Ungrouping to avoid future conflicts.
- Rearranging the columns in the order specified by the OP.
Another solution would be very similar to the above, but instead of joining to get the ID column and expanding the CV array, you could keep them from the start by summarizing them as a list column and then unnesting them.
dfa %>%
  mutate(ID = as.factor(ID)) %>%
  group_by(Root) %>%
  summarise_each(list(~ if(is.numeric(.)) cv(.) else list(.))) %>%
  mutate_at(vars(Column1:Column3),
            list(~ ifelse(. > 30, NA, 0))) %>%
  unnest(cols = c(Accession, ID)) %>%
  mutate(ID = as.integer(ID)) %>%
  bind_rows(dfa, .) %>%
  group_by(Root, ID) %>%
  summarise_each(list(~ if(is.numeric(.)) sum(., na.rm = FALSE) else first(.))) %>%
  ungroup %>%
  select(-ID, -Root, everything())
FAQs
How do I replace NA across multiple columns in R?
Use dplyr::coalesce() to replace NA with 0 in multiple data frame columns selected by name, or dplyr::mutate_at() to select columns by name or index; tidyr::replace_na() also works. With these functions you can also replace NA with an empty string.
How do you remove all NA in a column in R?
By using na.omit(), complete.cases(), rowSums(), or tidyr::drop_na() you can remove rows that contain NA (missing values) from an R data frame.
How do I replace NA values in a column in R?
You can replace NA values with zero (0) in numeric columns of an R data frame by using is.na(), replace(), imputeTS::na_replace(), dplyr::coalesce(), dplyr::mutate_at(), dplyr::mutate_if(), and tidyr::replace_na().
How do I replace all NA values in a data frame in R?
You can replace NA values with a blank string in the columns of an R data frame (data.frame) by using is.na() and replace(). Use dplyr::mutate_if() to replace only in character columns when you have mixed numeric and character columns, and dplyr::mutate_at() to replace in multiple columns selected by index or name.
How do I replace all 0 with NA in R?
R has a built-in function called replace() that replaces values in a vector with another value, for example, zeros with NAs.
The easiest way to replace NAs with the mean in multiple columns is to use mutate_at() and vars(). These functions let you select the columns in which you want to replace the missing values. To actually replace the NA with the mean, you can use replace_na() together with mean().
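A minimal sketch of the mean replacement described above, assuming a made-up data frame in which only columns a and b should be filled:

```r
library(dplyr)
library(tidyr)

# Hypothetical data; id is a character column and is not touched.
df <- data.frame(
  a  = c(1, NA, 3),
  b  = c(NA, 4, 6),
  id = c("x", "y", "z")
)

# For each selected column, fill NA with that column's mean of the
# non-missing values.
res <- df %>%
  mutate_at(vars(a, b), ~ replace_na(.x, mean(.x, na.rm = TRUE)))

print(res)
```

Note that na.rm = TRUE is needed inside mean(); without it the mean itself would be NA.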