Natural language inference (NLI) is a central problem in natural language processing (NLP): predicting the logical relationship between a pair of sentences. Lexical knowledge, which represents relations between words, is often important for solving NLI problems. This knowledge can be obtained from an external knowledge base (KB), but only when such a resource is available. Instead of using a KB, we propose a simple architectural change for attention-based models. We show that adding a skip connection from the input to the attention layer lets the model make better use of the lexical knowledge already present in the pretrained word embeddings. Finally, we demonstrate that our strategy makes it straightforward to use an external source of knowledge by incorporating a second word embedding space into the model.
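
As a rough illustration of the idea, the following minimal sketch shows one way a skip connection from the input to the attention layer could look. It is not the exact model: the function name `attend_with_skip`, the use of concatenation, the dot-product scoring, and the random stand-ins for the encoder outputs are all assumptions made for this example. The optional `extra_*` arguments illustrate how a second word embedding space might be added in the same way.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attend_with_skip(prem_emb, hyp_emb, prem_enc, hyp_enc,
                     extra_prem=None, extra_hyp=None):
    """Hypothetical cross-sentence attention with an input skip connection.

    prem_emb, hyp_emb: pretrained word embeddings, shape (length, d_emb)
    prem_enc, hyp_enc: contextual encoder outputs, shape (length, d_enc)
    extra_prem, extra_hyp: optional second embedding space, (length, d_extra)
    """
    # Skip connection (assumed here to be a concatenation): the attention
    # layer sees the raw embeddings alongside the encoder outputs.
    prem_parts = [prem_enc, prem_emb]
    hyp_parts = [hyp_enc, hyp_emb]
    if extra_prem is not None and extra_hyp is not None:
        prem_parts.append(extra_prem)
        hyp_parts.append(extra_hyp)
    p = np.concatenate(prem_parts, axis=-1)
    h = np.concatenate(hyp_parts, axis=-1)

    # Dot-product attention from each premise token over hypothesis tokens.
    scores = p @ h.T                    # (len_p, len_h)
    weights = softmax(scores, axis=-1)  # each row sums to 1
    aligned = weights @ h               # (len_p, d_enc + d_emb [+ d_extra])
    return weights, aligned

# Toy usage with random stand-ins for embeddings and encoder outputs.
rng = np.random.default_rng(0)
prem_emb, hyp_emb = rng.normal(size=(4, 8)), rng.normal(size=(6, 8))
prem_enc, hyp_enc = rng.normal(size=(4, 16)), rng.normal(size=(6, 16))
weights, aligned = attend_with_skip(prem_emb, hyp_emb, prem_enc, hyp_enc)
print(weights.shape, aligned.shape)
```

In this sketch the attention scores depend directly on the similarity of the pretrained embeddings, so word-level relations encoded in the embedding space can influence the alignment even if the encoder has transformed them.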